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‘One function of theory is to develop methods for solving 
tctical problems. Another is to organize and to make 
mprehensible the domain of knowledge in which the 
feory operates. Those who apply theory place their own 
Llues on these two functions, but all, in fact, use both. 
» be solved, even the most practical of problems must 
Laas in the terms of some underlying logical structure; 
| solve it, even the most practical of engineers must 
mprehend this structure and understand the terms 
‘ereof. 
The theory of communication offers the engineer a 
wical structure founded upon circuit theory and upon 
.e theory of probability. Within this structure, it offers 
m methods for solving certain practical, and even 
ypractical, problems. These are problems of detection. 
lhe theory recognizes the operation of detection as that 
decision or measurement in the presence of error. This 
exactly the problem to which the efforts of mathematical 
atisticians have been directed since early in the century. 
hus the tools of the statistician are available to the 
igineer; he can now design detection systems, optimum 
y one of several criteria, for signals perturbed by one of 
-veral different types of noise. 

For the case in which the signals to be detected are 
me-series perturbed by added noise, statistical tools 
re particularly sharp; theory offers in this limited domain 
ethods of such elegance and power that they transcend 
ven the challenge of the practical problems for which 
1ey were first conceived. 

It is scarcely an understatement to say that problems 
‘ detection are the only ones for which statistical theory 
ow gives the engineer specific methods of solution. 
heory has, however, gone farther than this in organizing 
is thought. 

In the first place, the theory of detection itself can be, 
ad recently has been by Middleton and van Meter, 
abedded in the more general theory of statistical decision, 
; developed by the late A. Wald and others. This latter 
leory, in turn, allows a very general measure of the 
srformance of a detector, derives general relations among 
ethods of detection, and in some cases delimits the 
ymain within which the engineer must search to optimize 
s chosen measure of performance. All results at present 
late to the case in which the list of alternative decisions; 
., the list of signals, is pregiven and fixed. 

This decision theory of Wald provides a pattern within 
hich one can account for the undesirability of errors in 
tection by assigning almost arbitrarily an appropriate 
st to each error, and to indecision. The cost of error, 
ywever, is not the only cost with which an engineer 
ust cope. He also has to reckon the cost of his equipment, 
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either as a first cost, or as an operating cost measured 
either in money or more abstractly in terms of complexity, 
liability to failure, or sensitivity to deviations in com- 
ponents. Most such costs depend critically upon the 
accidents of available technology; it is too much to ask 
that a general theory account for them in detail. Yet 
there is one cost of this kind for which an effective account- 
ing is possible. 

When dealing with communication as a service—as dis- 
tinct from communication as an adjunct to measurement 
or control—the engineer can fairly assume that revenue 
is a significant negative contributor to operating cost, 
and further, that the rate of revenue is proportional to 
the number of messages handled in a given time. Motivated 
by these assumptions, the work of Shannon comprehends 
in a single structure the engineer’s view of the process of 
communication as a service. 

Here appears the quantity of primary interest to 
such communication, namely, the communication rate in 
effective characters per unit time. Here appears the oper- 
ation of modulation, in recognition that the list of 
alternative signals is not pregiven and fixed; indeed, the 
engineer designing a communication system has control, 
not merely over the mode of detection, but also over the 
actual form of the signals to be detected. Here also 
appears, and for the first time, the general concept of 
the received signal as a random entity merely distributed 
jointly with the transmitted signal. 

Within this framework, Shannon’s theory now exhibits 
a least upper bound to the achievable rates of communi- 
cation. The bound is a quantity intrinsic to the medium 
of communication and its accompanying noise; to attain 
it exploits the full freedom of the engineer both to detect 
and to modulate optimally. 

This theory also leaves room to discuss another ab- 
stractly measured element of (positive) cost: the 
complexity of terminals as measured by the delay required 
for the processes of modulation and detection. Little is 
now known, in general, about this latter quantity. 

Most of the definitive statistical results in the theories 
just sketched describe the optima which can be achieved 
by the designer who knows exactly, and is able to exploit 
to its utmost, the statistical description of the universe 
within which his design must operate. Without denying 
the value of such results to the designer, and certainly 
without belittling the present accomplishments of theory, 
one can still urge consideration of other of the designer’s 
problems. 

In the first place, practically, the designer has only 
limited control over his medium. In principle, present 
theory has room for this phenomenon, since failure of the 
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implementation to realize exactly the designer’s plan 
can be regarded as a form of noise. Unfortunately, we 
have at present no attractive or tractable general models 
for such noise; furthermore, except perhaps in extended 
systems of many similar tandem links, this kind of noise 
lacks the important simplifying property of ergodicity. 
There appears here to be a need both for new concepts 
and for new methods. 

A fundamental weakness in present theory is the 
assumption of perfect knowledge on the part of the de- 
signer. In fact, design itself is a statistical decision. 
Rationally, it demands measurements of, rather than 
assumptions about, the statistical universes involved; 
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this places it within the purview of Wald’s theory @ 
decision, or, alternatively, of the theory of games. T 
what extent the general results of these theories can b 
usefully adapted to problems of design remains an opet 
and interesting question. 

Practically, because of the limitations, already noted 
of his basic data and of his available implementation 
the working engineer generally cannot now justify striviny 
for the ultimate optima to which present theory can leac 
him. Even in the absence of major improvements it 
theory, he could profit from the development of method: 
which are conceptually simple and demonstrably in 
sensitive to assumptions. 


Presence 


_Summary—This paper treats the detection of pulsed signals in 
he presence of receiver noise for the case of randomly fluctuating 
gna strength. The system considered consists of a predetection 
rage, a square law envelope detector, and a linear postdetection 
ntegrator. The main problem is the calculation of the probability 
ie function of the output of the postdetection integrator. The 
alysis is carried out for a large family of probability density 
inctions of the signal fluctuations and for very general types of 
orrelation properties of the signal fluctuations. The effects of 
onuniform beam shape and of nonuniform weighting of pulses by 
ne postdetection integrator are also taken into account. The function 
rhich is actually evaluated is the Laplace transform of the prob- 
bility density function of the integrator output. In many of the 
ases treated, the resulting Laplace transform has an inverse of 
nown form. In such cases the evaluation of the probability density 
unction would require the computation of a finite number of 
onstants; in practice this would usually require the use of com- 
uting machinery, but would be perfectly feasible with presently 
vailable computing machinery. 


| N EXTENSIVE treatment of detection theory 
for pulsed signals in noise, for the case where the 
signal amplitude is constant, was given by 
Jarcum.'’” This analysis has been extended by other 
ivestigators’” to some cases where the signal pulse 
mplitudes are randomly modulated. However, almost 
ll analyses to date have dealt with just two cases, insofar 
s the correlation properties of the signal fluctuations are 
oncerned: the signal amplitudes fluctuate independently 
rom pulse to pulse, or the signal amplitudes are constant 
luring the integration time of the receiver but are inde- 
endent from one integration period to the next. Fluctua- 
ions not conforming to one of these assumptions have 
een treated only in very special cases.* The purpose of 
his paper is to extend the analysis to much more general 
ases, insofar as the correlation properties of the fluctua- 
ions are concerned. The analysis is carried out for a 
urge family of probability density functions for the 
ignal fluctuations. Also, since no additional work is 
nvolved, we treat the case where the postdetection 
ategrator forms a weighted sum of the input pulses. 
We shall consider a receiver consisting of a predetection 
tage, a square law envelope detector, and a linear post- 


* Manuscript received by the PGIT, January 18, 1957. 

+ Control Systems Lab., University of Illinois, Urbana, II. 

1J. I. Marcum, “‘A Statistical Theory of Target Detection by 
ulsed Radar,’ The RAND Corp., Res. Memo. RM-754; Decem- 
er 1, 1947. 

2J. I. Marcum, “A Statistical Theory of Target Detection by 
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Detection of Fluctuating Pulsed Signals in the 


of Noise’ 


PETER SWERLINGT 


detection integrator. The receiver noise at the detector 
input is assumed to be additive Gaussian (with zero mean), 
completely correlated for times of the order of one pulse 
width and completely uncorrelated from one pulse to the 
next. The detector is assumed to be a square law envelope 
detector. Tor mathematical convenience, the detector 
output is assumed to be normalized as follows: detector 
output equals input envelope squared divided by twice 
the input mean square receiver noise voltage. This normali- 
zation was used by Marcum”’ * and followed by Swerling;*? 
it simplifies some of the formulas, but results in no actual 
loss of generality. 

Denoting by v; the (normalized) zth pulse emerging 
from the detector, we assume the postdetection integrator 
to form the following weighted sum: 

N 
y = integrator output = py aD; (1) 
t=1 
where a; are positive real numbers. 

We assume that the detection procedure requires the 
integrator output y to exceed a threshold Y, in order for 
detection of a signal to be announced. Here Y, isa dimen- 
sionless quantity, because of the normalization of the 
detector output described above. 

If G(y) represents the probability density function 
(pdf) for y, and if Go(y) represents the pdf in the case 
where receiver noise only is present, then 


Probability of detections if Cua 
Yo 


Probability of false alarm = [ Go(y) dy. (3) 
Yo 
In most applications, the probability of false alarm is 
set at some desired level and Y, is calculated from (3), 
then probability of detection is calculated from (2). 
In case a; = 1, all z, then G,(y) is” 
1 N-1 -y 


Gag) = wWw—pi e (ora. == dpwalles ye 


(4) 
In the general case, the Laplace transform of G)(y) is 
given by 


fo) N 


[em Goty) ay = 
0 


5-1 Ap + ie 

(Real part of p = 0.) 
This holds, of course, independent of any assumptions as 
to the signal fluctuations. 


In order to calculate G(y) in the presence of signal, it is 
necessary to specify the statistical nature of the signal 
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fluctuations. We define x, = ratio, at the detector input, 
of the signal power for the 7th pulse to the mean receiver 
noise power. 

We assume that x, is of the following form: 


L 
eo aes, (Aco 
k=1 


where L is a positive integer, and w,,; are Gaussian random 
variables with zero mean. (The x; are also assumed to be 
statistically independent of the receiver noise.) 


, N) (6) 


Define the random vectors U'”, --- , U“’ by 
= i 5 ee) (as ee eB) oe 107) 
We assume that the U“’, k = 1, --- , L, are mutually 


statistically independent. Also, it is assumed that U“’ has 
covariance matrix (¢;;)) where, denoting expected values 
by a bar, 


Ch) a rae | il OS 
Oe = Ws. iia. 


(8) 


j= dy. eee 


In radar applications, the fluctuation in x; is considered 
to be due largely to fluctuation in the scattering cross 
section of the target. To relate the above formulation to 
more familiar types of fluctuation, consider the case where 
E& = 2K and where, for each 4, uj, = up, = °°: = Us; 
Then it is easily verified that x; has a pdf w(a,; @;) given 
by 


He 1 es e (=) 
oo i i (Kk rors 1)! Xx; Xx; : P X; : 
0 


ayers 


(9) 


where exp ( ) stands for the exponential function. 

Here <; represents the average of x; over the fluctua- 
tions. For K = 1, this reduces to the familiar exponential 
or Rayleigh fluctuation for the signal power. 

Note that we do not require €; to be the same for all 7. 
This can be considered in radar applications to reflect the 
effect of beam shape. 

For present purposes, no attempt will be made to 
derive the above formulation of the signal fluctuation 
from physical considerations. Its justification is simply 
that from it one can construct a wide variety of pdf’s 
and correlation properties for the signal fluctuation. 

We shall now compute the Laplace transform C(p) of 
the pdf G(y) of the integrator output y. This is defined as 


fo) 


cM = | e"G@ dy (10) 
where p is a complex number with nonnegative real part. 
Since G(y) vanishes for y < 0, we are able to deal with 
the Laplace transform, integrating only from zero to 
infinity. 

If we consider the conditional pdf for y, for definite 
, vy, it is well known’ that the Laplace 
, ty) of this conditional pdf is 


values of 2, --: 
transform C(p|a, °°: 
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C(pla:, --: , ty) = ry Al eee Ex | pee |. (11) 
(p vy, ) tn) = oa 1 ae ap z p 1 ae a ;p 
In view of (6), this can be written 
C@lzi, a ty) 
N 
N 1 Tap 2d, Wie 
= ee ION = 12 
OG persprry b ah eG o 


C(p) is simply equal to this expression averaged over 
the probability distribution of U, --- , U'”’. Because 
of the mutual independence of the U“’, one may write 


N 1 N : 
cm =| ts] fw |- 28% 


L 
-(ui,;g Fore + ui.) | [J Pu”). (13) 


Now, assuming nonsingularity of (¢{;’) for each k, 
dP(U) a el 
(Qr)*/? A} 
N 
“exp | -3 3D EP ats | a’ (4 
i,7=1 
where 
A, = determinant (¢{%), 
(€) = matrix inverse of ({%? 
Thus, 
N il L 
eo =| 15] | 15 
(p) I] 1 top I] k (15) 
where 
1 N 
E, = aa | 4 tp Un, Un, 3 
sem LON ates Nl as gs 
~ aip (k) 
Dae oe 
ny > rare ut. dU (16) 
Now define 
ER (p) =p, tj 
(k) 2a ;p . = 
eae is ata 3) “sa 1% 
Soh eae A j (17) 
and 
T.(p) = determinant ( (p))- (18) 
(rp r,(0) = | 
UuS 1; = Ay 
Then 
R, = ES (ec 
T'.(p) (27) N/2 
N 
i exp {4 ye EC. on.s} aw). (19) 
4 ,9= 1 
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| Now suppose for the moment that p is a nonnegative 
eal number. Since the right-hand factor is then just the 
ategral of a probability density and hence equal to unity, 


(Oat 
», - [BOP 
f : I'.(p) 
Thus, provided (6$? 
verses (aap 


(20) 


) is nonsingular with matrix in- 


@) = (aah tee} 


vhere I',(p) is defined by (17) and (18). This has been 
lerived for p, a nonnegative real number, but by analytic 
ontinuation it clearly holds for all p in the closed right 
aalf plane. The only interesting singular case is that of 
tomplete correlation of the «;, which can be treated as a 
separate case.” 

It is interesting to consider some special cases. For 
bxample, suppose a; = 1, all 7. Define 


(21) 


A! (p) = ith eigenvalue of (\*(p)) (22) 
A: (0) =r,” =tth eigenvalue of (€%). 
ren, i a, = 1, alle,r,°@) =rAP’’ + wee and 
| lee 
* 2 
mie) = TI (+ Bs) 
i=1 Pp (23) 
N 
PO) Le: 
1=1 
Putting (23) into (21), for the case a; = 1, all 7, 
1 1 ~* a+ ee ah 
= pth emrreias aoaner F<. . 24 
| (p) ate ie IT}! wee G+ pa® (24) 
If the covariance matrices are all equal, ¢;;? = ¢;,, all 
Me, then d;” = X,, all k, and 
1 x | 2p ee 
Co) == 1+ ——+—— : 25 
| ®) (1 + p)” a (1 + pa; ) 
| Specializing still further, suppose that 
| Cusp 
| (i; ) = @ii), all k 
| a, =1, all i 
| Ui oreo all i. and ic (26) 


In these cases, x; is distributed according to the pdf 


(9) and 


f= ti Kos “allt, (27) 


The assumptions a; = 1, all 7, and #; = 2Ko’, all i, 
amount in radar applications to assuming a uniform beam, 
and uniform weighting of pulses by the postdetection 
integrator—assumptions almost invariably made in prob- 
ability of detection calculations. However, even the 
assumptions listed in (26) allow a large degree of freedom 
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in the choice of correlation properties and pdf’s for the 
signal fluctuations. 
Now define 


1 . 
on: 
Using (25), (27), and (28), we obtain under the conditions 
listed in (26) 


uw; = 2th eigenvalue of be = 
oO 


(28) 


c@) =a +p" T] ie 4% ss) (29) 
Pp 


where @ is defined in (27), K in (26), and yu; in (28). This 
ean also be written 


1 = | Hep | 
C ——— 1 oa 30 
COE Au + KG +p >) 
Special cases are 
= 1 
a a ap En 
- N 1 

K = 2:0) =(1+p)" T] (31) 


pan wid\ | 
f a rl es )] ; 


These formulas (31) reduce to the correct formulas’ 
when the signal strength fluctuates independently from 
pulse to pulse, in which case uw; = 1, all z. Also, they reduce 
to the correct formulas’ in the case where the signal 
strength is constant for the N pulses v,, --- , vy; in this 
case, uw; = 0,7 = I, , N-1, and wy = N. The correct 
formulas are obtained even though complete correlation 
leads to a singular covariance matrix (¢,,;); this indicates 
that the validity of (29) extends to some cases where 
(¢;;) 18 singular. 

It is interesting to note what happens to C(p) in (380) 
when K — o. Since, for K > o, the standard deviation 
of the fluctuations about the mean value goes to zero, one 
would expect C(p) to approach the form it would take for 
nonfluctuating signals. This is indeed what happens: from 
(380), as K > ~, 


1 Saye = | 
(6a ae 
But 
al d:; 
» yu; = trace of (4 =a G 
so 


1 — Np 
CQO) ex | | 
: eee eee 
(compare Marcum’). 
One can use the expansion 


a —K | a? a 
(1 3 *) SP Oona 
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to obtain the following expression 


p ; 
i lat a2 =|} (82) 


valid for | a/K | < 1, 
for C(p): 


25, = Np 
C(p) = a exp {ya 


where 


7=0 


Cs 


Eq. (82) is valid for all p 1, all 7; other- 


wise it is valid for 


p< min 
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It is useful to note that, in the cases represented by the 
assumptions listed in (26) and, in fact, in more genera 
cases, the Laplace transform C(p) can be inverted in ¢ 
straightforward, though possibly very tedious, manner 
The simplest case is, of course, K = 1 [see (31)], in whick 
case, if for example the uv; are distinct, the pdf G(y) is 0 
the form 
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Gy) = dic: exp es 


where the c; are readily determined. Since digital compute! 
programs exist for finding the eigenvalues of N X N 
matrices up to fairly high values of V, the above formulas 
can be utilized for digital machine computation of prob: 
ability of detection in a fairly straightforward way. 
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Recursion Methods’ 
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Summary—Given a set of equally spaced measurements, it is 
possible to curve fit a “least squares’ polynomial to the N observed 
data points and obtain estimates of the past, present, or future values 
of the data or its derivatives by appropriate manipulations of the 
curve fit. 

This curve fitting can be accomplished by a linear weighting of 
the observed data over an interval (n—1) 7. If the data is measured 
in real time such that a new data point is observed each T seconds, 
then the desired output (for example, the smooth or predicted value 
of the data) can be obtained by sliding these fixed number of weights 
such that the same weight always multiplies the data which is at a 
fixed lag with respect to the most recent data. Since these weights 
are zero for lags greater than n, they may be described as a fix-finite 
memory linear digital filter. 

In calculating the desired output for each new sample one requires 
a machine which can store n coefficients, n data points and performs 
n multiplications and n—1 additions in at least T seconds. The coef- 
ficients do not change but the multiplications and additions must be 
performed each 7 seconds as a new data point is measured. 

For large values of n, and small 7, this may put a severe require- 
ment on the real time solutions of the computer. This paper presents 
an alternate technique using recursion formulas to obtaining the 
same results as the n point weighting equation. The method has the 
advantage of requiring considerably less storage, multiplications 
and additions when n> 1 and the degree of the curve fitting poly- 
nomial (K) is small. 


Manuscript received by the PGIT, Ue 21, 1957. 
+ Convair Astronautics, San Diego, ‘Calif 
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Statement of the Problem 


ET y@) = PW) + N() be the input to a digital 
ile filter, where P(t) is a polynomial in ¢ of degree K, 
and N(t) is a white noise source. The data are 
sampled every 7’ units in time. It is desired to estimate 
the L™ derivative of P(t) evaluated at ft (m + a)T. 
The estimate is to be made in real time at / = mT. The 
filter has a finite memory and operates only on the present 
data point and the (n — 1) previous data points. These 
properties may be summarized by the following equations. 
Let Z,, be the output of the filter at ¢ = mT and Yusm—n 
be the input to the filter at f = (wu + m — n)T then 


= oC yee tO eee 0, +1, +2. (1) 
u=1 
The coefficients C,, are also functions of K, a, and L. 
However, these parameters are presumed fixed for a 
particular problem, and are deleted from the notation, 
to simplify the equations. The coefficients C,, are chosen 


qP57 


| the standard least squares sense: that is, 


E\Z,) = Ee) p= ene ie (2a) 
EZ, — E(Z,,))” = a minimum. (2b) 


It is assumed that H(N(t)) = 0 for all ¢, and that N(é) 
a stationary and ergodic white noise signal. 

liq. (1) can be interpreted as follows: the output Z,, 
given by a running weighted average over the n most 
ecent data points. The weights are independent of m 
hd so the same coefficients always multiply the input 
hich is at a fixed lag with respect to the most current 
ata. As an example, let 


lino, Ye eh m = 3,4 and_5. 
Z, = Citi + Coy2 + Coys,  (m = 3), (3) 
Then 
Zs = Cin + Coys + Coys,  (m = 4), 
2, Cis eC Cote, (ms =" 9), 


The use of (1) for large values of n can be burdensome. 
is the purpose of this paper to determine a recursion 
‘rocedure for computing Z,, which will be easier to apply 
han (1) for large n and small K. 

_ For the remainder of the discussion it will be assumed 
hat 7 = 1 since the required formulas and tables are 
smplest to define for this case. This imposes no essential 
striction on the generality of the solution since it is 
mly necessary to multiply the coefficients C, by the 
nagnitude of (T-”) to obtain the answer for any other 
ralue of 7, not equal to one. 


rthogonal Polynomials 


In the tables of orthogonal polynomials by Anderson 
ind Hauseman” and Fischer and Yates* are shown a 
isting of the first five polynomials and some of their 
mportant properties. Let &,,,(w) be a polynomial in wu 
of degree V, where wu takes on the integral values wu = 1, 
2, 3, --: n. Then the basic property of these polynomials 
‘'s that 


DH.) = SV, x) (a 


De Ev n(WEw n(w) =O Vey. 


The tabulated values”’® are for a function £, ,(w) which 
s obtained from £y,,(u) by the relation 


1The operator E(V) is the ensemble average of, or expected 
value of, the quantity (V). 
| 2R. L. Anderson and E. E. Hauseman, ‘Tables of Orthogonal 
olynomial Values Extended to VN =104,” Res. Bull. 297, Iowa State 
ollege, Ames, Ia.; April, 1942. 

3R. A. Fischer and E. Yates, “Statistical Tables for Biological 
Agriculture and Medical Research,” Oliver and Boyd, Ltd., Edin- 
purgh and London; 1925. 
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Wee) 


fy (WU) = Av névn(U)- (5) 


The factor \y,, appears in the solution for C,, in both the 
numerator and denominator as \j,,, and thus cancels. It 
is more convenient analytically to use &y,,,(w) for arriving 
at a solution; however, once a solution has been obtained, 
the tabulated values for &,(w) can be used for compu- 
tational purposes. 

It is shown* that if P(w) (the input polynomial) is 
represented by 


K 
Plu) = 2S dytv»(U), (6) 
then the coefficients C,, are given by, 
K (L) 
ty n(Wevy n(n SF a) 
CE os TUN OE Ns 
PE @) 
where 
ae 
fy.n(n als aera Ey n(u) ae (8) 
du 


and wu is treated as a continuous variable with respect to 
the differentiation. 

Note that C, is a polynomial of degree K in wu. And 
since the coefficient u* in &x,,(w) is unity, then 


Kityi(n + a) 
SV) 


4 = 1, 2, 3p. 


: Cia eCe = 


where DY ( ) is the difference operator which advances 
the index u, by V, of the bracketed quantity. Thus 


DAG = Cusv or ID Ores) Bae Yut+m—ntV- (10) 


Eq. (9) is the symbolic representation of the operation, 
‘“‘K th difference’ and states that since C,, is a polynomial 
of degree K in wu, the K * difference of C,, with respect to u 
is a constant as given by (9). 

Let the K* difference with respect to m be taken of 
both sides of (1). Then 


Ce aaa tye Ze = (De aie i ye CA amaa 
or 
(De —1)*Z.0= SCD. = 1 vee ee 


Expanding (D,, — 1)*, and noting (10), one obtains 


K 
SS Ce al aes V3) 


V=0 


(De. ea = ; (12) 


u=1 


4M. Blum, “An extension of the minimum mean square _pre- 
diction theory for sampled input data,” IRE Trans., Vol. IT-2, 
pp. 176-184; September, 1956. 7 ; 

5 The operation (D,_1)* is the equivalent of the A” difference 
oniG a. 


LSO 


KI 


Coir: 


Letu+ V = L,L = 1, 2,3, --- K. Then the coefficient 
Of Yn—nst Is defined as by. where 


L ie 
mat YY 23. K+u-L 
b, = 2, C-1) € — a 


u=1 


KONG : P y 
where ( y is the binomial coefficient 


(13) 


Let K + 15 u+ V S n, then applying (9), the 
coefficient Of Ym—n+x+1 tO Ym 1S given by 


Bp — KMD + @) 


S(K, n) ee 


Finally letn + 1 S wu + V S n+ K, then the co- 
efficient of ¥,.4 (d = 1, 2,3, --- K,) is given by 6,, where 


= ead IK 
bg = sy BEES (15) 
V=d 
Combining (12)-(15) one obtains 
= -o( K 
Linvk = > i A ) (16) 
v=0 V 
K TK 
aia SS Dy nae = De buY m +d 
L=1 d=1 
n—-K 
lr B ys, Ym=n+K+i- 
j=1 
n—-K 
Let Dm aa DS Um—nt E479 (17) 
j=1 
then dn = Ym — Ym-n+k + m-1, SO that (16) may be 
written 


K=1 , K 
Zm+K ae ea SD Am ea ) 
v=o Ve 


K 
a DS (Define alsa One | Ropes 


L=1 


(18) 


Numerical Examples 
Let the parameters be as follows: 
K = 2. (quadratic input), 
a= a (desired output is the smooth value of input), 
L= 0 
n 


€.9., HZn) = Ym 

5 (finite memory equal to 47’), memory span 
over 5 most recent data points. 

Then (7) becomes, 


= fy s(wéy,s(5) 
ae 


V=0 


or 


Ef s(W)ét5(5) 


Saeae i 


S'(2, 5) 


C2 ya (19) 


Using the tables of Anderson and Hauseman’® one obtains 
the following: 


6 Anderson and Hauseman, op. cit., p. 610. 
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THEORY Septembe 
V—- 1 2 
j U £1/(u) £5'(u) Cu 
’ 1 ay 42 0.085714 
2 —1 —l —0. 142857 
3 0 —2 —0.085714 
4 +1 —1 0.257143 
5 +2 +2 0.885714 
S'(V,5) 10 14 
ee 1 1 
Note 
flu) = 1, 
S/(Oon ele 
and 
Nance = il 
The output Z,, is given by 
5 
Zan = yy CLs ee: (20 
By the choice of parameters, 
E(Z,) = PGT) (21 
where P(f) is a quadratic in ¢t. Eq. (18) becomes, 
Liners = CHE eT: Vigo oe O1Ymn—4 ste OsYpes 
+ O1Y aii se 02Uan ae) te Bom) (22 
where 
Pn i Urn a Ym-3 aa Drs (23 
and 
~ ais 
6g) = pe Cet ) (24 
V=d V 
Therefore, 
Oy = CG; a KO's <= — 1.514285 
6. = Cs = (0.885714 
and 
L 
eee K 
b — pes K+u A ) 
Xu a ES wy (28 
therefore, 
6, = CF = 0.085714, 
b, = C, — 2C, = —0.314285. 
Finally from (14) one obtains, 
p = KMo) teat + a) _ 2g 5808) (5 


SC, n) S’(2, 5) 


From table n = 5’ one obtains, 


7 Ibid. 


~S’ (2,5) = 14 directly below the table labeled & and 
» (2,5) = 1, directly below S’ (2,5), so that 
B= 0.285714. 


tarting the Recursion Process 


After computing the constants as described above, 
1e still cannot start the computations of Z,,,2, [see (22)] 
nee one always requires two previous smoothed values 
m+i, and Z,,. One may use (20) to compute Z,, and also 


‘m+ [by advancing m by 1 in (20)] but this would involve 
nost of the labor which this method seeks to avoid. An 
Iternate procedure is to assume that y, = 0 for x < 0 
nd to start the process with m a large enough negative 
umber so that on the basis of the assumption, it is known 
at Z,,., and Z,, are zero. As an example in Table I one 
s computed, the value Z,,,. form = —3 to m = 5 
suming an input of the form, 
Y,= xu + 0.5, Tat yadey sce (27) 
= QO, Ea): 
in this case V(t) has been taken as zero. 
TABLE I 
Zms3 Ymi2 Error in Zmae dm 
0. 0. OF 0 
0.442857 0.5 0.057148 0 
1.45714 1.5 —0.042858 0. 
4.32857 4.5 —0.171431 0.5 
9.37142 9.5 —0.128576 2.0 
16.4999 16.5 OF 6.5 
25.4999 25.5 0 15.5 
36.5000 36.5 0. 30.5 
49 .5000 49.5 0 51.5 


_ Note that the error due to the discontinuity at the 
>rigin (« = 0) disappears as soon as n samples are obtained, 
ncluding the origin. This is due to the finite memory 
sharacteristic of the filter. The zero values previous to the 
memory span do not affect the results and one can con- 
tinue the calculations as if they were not introduced at 
all. The particular example chosen was poor with respect 
o demonstrating the virtues of the recursion formula 
ecause n was so small. To demonstrate that the com- 
»lexity of the method is determined by K (the degree of 
she polynomial) and not by n, another example will be 
bonsidered. 
| Let the parameters of the problem be taken as 


nm = 10 (memory span 10 samples), 

a = 1\ (prediction of derivative of the input at the 

tee | next sample point: E(Z,.+2) = P’ ((m + 8) 
T), 

K = 2 (quadratic input P(t) = ao + a, t + at’). 
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Then from (7), 


: EY, OE, eal 1) 
C= 297, 10) 


From Anderson and Hauseman® 


Ef io(u) = Ar, 10(U a LN, 


V=1 


ef solu) Jett = Diao = 2, 
E,10(U) = Av,1o((@ — 11/2)’ — 99/12), 
&- £1000) lenis = Aeno:(11) = 11/2." 
One requires the formula, 
Lincs —2Ln a — Ligit Oi nsd A OU mee 
+ B1Yn-0 + B2m-s + Bbms 


where 
b; = C, = 0.195455, 
b, = C, — 2C, = 0.35001, 
6, = Cy — 2Ci) = —0.483333, 
6. = Cio = 0.304545, 
and 


B a 2deaofS QD) 
S’(2, 10) 


This is no more complex a solution than when n = 5. As 
K increases by one, the number of constants 6 and 6 
increase by one. The number of previous values of Z,, 
increases by one also. The complexity of the solution 
increases linearly with increasing K. Table II represents 
a comparison of the computing machine requirements for 
the weighed average technique, (1), and the recursion 
formula, (18). 


= 0.041667. 


TABLE II 


COMPARISON OF CompuTING MAcHINE REQUIREMENTS FOR 
WEIGHTED AVERAGE FILTERING AND RECURSION FILTERING 


nm = number of samples 
K = degree of polynomial passed without error by filter 


Requirements | Weighted Average Recursion Formula 
Storage n coefficients 2K + [K/2]* coefficients 
n: data points K + 1 intermediate results 
Total: 2a n + K data points 
Totaln + 4K +1 + [K/2] 
Multiplication n 2K +1 + [K/2] 
Additions n— I 2(K + 1) + [K/2] 


*[K/2] = smallest integer in the fraction, e.g., [144] = 1. 


Again one may begin the solution by assuming, 
Yn 0, ee 0 


and say, for instance, 


8 Ibid., p. 598. 
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y, = « + 0.5 for x 


IIV 


0 and let m 


A339 St aioe 


One may verify by direct calculation that the transient 


error will exist for m = —2 tom = 6. 
For m = 7, e.g., when at least 10 data points from the 
curve, y, = v + 0.5, (« = 0), have been used in the 


formula, the transient error due to negative x, will not 
effect the current answers. 


INFORMATION THEORY September 


CONCLUSION 


A method has been presented for curve fitting to a 
polynomial in the Jeast squares sense. The method has 
the advantage of using recursion formulas and thus avoid- 
ing longer calculation when the number of data points 
is large. 

A method of starting the recursion process is suggested 
which has the advantage of simplicity of usage. A transient 
error is generated which however, disappears over the 
interval of one memory span of the filter. 


Locally Stationary Random Processes | 


RICHARD A. SILVERMANT 


Summary—A new kind of random process, the locally stationary 
random process, is defined, which includes the stationary random 
process as a special case. Numerous examples of locally stationary 
random processes are exhibited. By the generalized spectral density 
W(w, w’) of a random process is meant the two-dimensional Fourier 
transform of the covariance of the process; as is well known, in 
the case of stationary processes, V(w, w’) reduces to a positive mass 
distribution on the line w = w’ in the w, w’ plane, a fact which is the 
gist of the familiar Wiener-Khintchine relations. In the case of 
locally stationary random processes, a relation is found between 
the covariance and the spectral density which constitutes a natural 
generalization of the Wiener-Khintchine relations. 


INTRODUCTION 


N THIS PAPER we introduce the concept of a locally 
| stationary random process, which is a natural gen- 
eralization of the notion of a stationary random 
process. We shall show that there is a basic symmetry 
between the covariance and the spectral density of a 
locally stationary random process, and that the study of 
locally stationary random processes gives additional in- 
sight into the structure of stationary random processes. 
Before defining locally stationary random processes, we 
recall some relevant facts about random processes in 
general and stationary random processes in particular. 
For details, the reader is referred to the books of Loéve,* 
Blane-Lapierre and Fortet,” and Doob.* 
Let v(t) be a random process (in general complex), 7.e., 
a one-parameter family of random variables indexed by a 
real parameter ¢ lying in some index set 7’, which we shall 
take to be a closed interval, usually the infinite real line 


* Manuscript received by the PGIT, January 25, 1957. The 
research reported in this article was sponsored in part by Contract 
No. DA49-170-se—2253. 

+ New York University, Inst. of Math. Sci., New York, N. Y. 

1M. Loéve, “Probability Theory,’ D. Van Nostrand Co., Inc., 
New York, N. Y., ch. 10; 1955. 

2 A. Blanc-Lapierre and R. Fortet, ‘““Théorie des Fonctions Aléa- 
toires,’”’ Masson, Paris; 1953. 

3 J. L. Doob, “Stochastic Processes,” John Wiley & Sons, Inc., 
New York, N. Y.; 1953. 


Rk. We shall regard the parameter ¢ as the time, and x(¢) as 
some physical noise process evolving randomly in time. 
We shall assume that the second moment of x(t) exists 
for all t e 7, and that for all t ¢ 7’, the first moment of 
a(t) is zero;* the latter assumption involves no loss of gen- 
erality.” By the covariance of x(t) is meant the function 
of two variables 


T(t, v) = (a(Qa*(v’)), (1) 


defined on the product set 7 & 7; the angular parentheses 
denote the ensemble average, and the asterisk denotes 
the complex conjugate. It is apparent from (1) that 
IG, ) 1s Hermitian, 7v.¢., that IG 1) Cap ee 
function T'(, t’) on T’ X T is said to be of nonnegative 
definite type, 1f for every finite subset 7, of 7, and every 
complex-valued function h(t) 


DS VG, AO) 0; 


Paracel 


batreals 


where ¢ and ¢t’ separately range over all the points of T,, 
A fundamental theorem states that a function T(t, t’) or 
T X T is a covariance tf and only if it 1s of nonnegative 
definite type.° More specifically, if I(t, t’) is of non- 
negative definite type, one can always construct a Gaussiar 
random process which has I(t, ¢’) as its covariance. I 
should be noted that any positive constant is a covariance 

The random process x(t), defined for all real ¢, is saic 
to be statzonary in the wide sense, or to have a stationart 
covariance, if its covariance is of the form 


G5 t= Cale Ome bvtiteulen (2 


The quantity (| x |* ) is the average instantaneous powel 
of the process x(t), and is independent of the time. We 


‘The symbol e means is a member of; the product set T X T i 
the set of all pairs (¢, ¢’) with t, t’ e T. 

5M. Loéve, op. cit., p. 465. 

6 [bid., p. 466. 


an = 1. The fact that eT 1) is Hames pans that 
(7) = C*(—7). Suppose now that C(r) is continuous 
|tontinuity at the origin is sufficient) and that 


[ le@\ar< 


then it follows® that there is a continuous nonnegative 
unction ®(w), called the spectral density of the process 
(t), such that 


(2 CG) = [exp Gwe) de, (3) 


CY Uz f exp (—iw7)C(1) dr, (4) 
e., such that (| x |?)C(r) and ®(w) are Fourier transform 
hairs.” In particular, we have 
| 2%) dw = (2h). 

gs. (3) and (4), together with the stated requirements 
in the functions C(7) and ®(w), constitute one form of 
the famous Wiener-Khintchine relations. It can also be 
hown’’’"' that the process x(t) itself has the representation 


HOS i ; ReMANO (5) 


vhere ¢(w) is another random process, and where the 
integral is meant in the mean square sense.” The process 


(w) is one of orthogonal increments,’? which means in our 
base that 


(d#(w) dé*(w’)) = Ww) 6@ — w’) dw dw’, 


where 6(w — w’) is the Dirac delta function, and ®(w) is 
ihe spectral density of x(t) defined by (4). 

We need the following facts about the harmonic analysis 
bf nonstationary random processes. Suppose that x(t) is 
larmonizable,’* i.¢., that x(t) can be written in the form (5), 
vhere £(w) has a covariance of bounded variation on 


im X &. It follows that 


Wee! (| [exp Gt ate) | 
i] i ‘ exp (iw’t’) ae |") 


| 7The terms correlation function and stationary covariance will be 
ised interchangeably. 

8 J. L. Doob, op. ctt., p. 

| 9° If the requirement that Go be absolutely integrable is relaxed, 
3) and (4) are replaced by more general expressions (see J. L. Doob, 
ip. cit., p. 519). This requirement on C(r) corresponds to limiting 
‘he memory of the process x(t) in a way which is quite reasonable 
bhysically. 

10M. Loéve, op. cit., p. 483. 

J, L. Doob, op. cit., p. 527. 

| 2M. Loéve, op. cit., p. 472. 

| 3 Tbid., p. 479. 

4 Tbid., p. 474. 
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=| [| exp fier - 0't)] ae) ae), © 
a fact which is summarized by saying that I(é, ¢’) is 


harmonizable.'* We shall assume that any random process 
with which we are concerned is harmonizable, that 


i, / NTC Nea 0 ie <emco 
except in the stationary case, and that 
(dé(w) dé*(w’)) = Vw, w’) dw dw’, 


where W(w, w’) is a continuous function.’’ The function 
W(w, w’) will be called the generalized spectral density of 
the process x(t). Thus (6) becomes 


Pat le ie i exp [7@t — w/t’) VW, w’) dw dw’. (7) 


In particular, we have 


/ Hi Vw, w’) dw dw’ = 


The function W(w, w’) is a covariance by construction, 
and obeys the inversion formula 


TOF Oye 


vee ay 


Eqs. (7) and (8) are the natural generalizations of (3) 
and (4). As noted by Blanc-Lapierre and Fortet,’® in the 
stationary case V(w, w’) is a positive mass distribution 


concentrated on the line w = w’ in the w, w’ plane. Indeed, 
any generalized spectral density corresponding to a con- 


exp [—i(wt — w’t/)]T(t, ’) dt dt’. (8) 


tinuous mass distribution on the line w = w’ can be 
written 

Vw, wv’) = Bw)dw — w’) (9) 
where 


Pw) = il Vw, w’) dw’ > 0. 


It follows immediately, after substituting (9) in (7), that 
the covariances corresponding to distributions of the form 
(9) are stationary. 


6 Again, these assumptions are not the most general mathe- 
matically, but are quite appropriate physically. In particular, 
T(¢, t’) must be continuous and bounded. In the stationary case, 
I, ¢’) cannot be absolutely integrable in the two variables ¢ and ’; 
instead we require that it be integrable in the difference variable 
7 = ¢t — Uv. In the white noise case, we avoid infinite average in- 
stantaneous power by using the correlation function A(z) (see 
below), and corresponding to the discontinuity of A(7) at 7 = 0, 
we permit I'(é, t’) to be discontinuous when ¢ = ¢’. Similarly, in the 
stationary case, we permit Vw, w’) to be discontinuous when 
w» = w’, corresponding to the discontinuity at the origin of the 
Dirac delta function [see (9)]. 

16 A, Blanc-Lapierre and R. Fortet, op. cit., p. 382. 
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We shall say that the random process x(t), defined for 
all real t, is locally stationary in the wide sense, or has a 
locally stationary covariance, if its covariance can be 


written as 

| 
oy PES (tethe 
T(t, t’) = (| ol 2 
where again C(7) is a (normalized) stationary covariance. 
The symmetrized variable (¢ + t’)/2 has been chosen 
because it makes I(¢, ¢’) Hermitian and because of its 
suggestive meaning as the average or centroid of the 
points ¢ and ¢’. Qualitatively, a locally stationary random 
process has a covariance equal to a stationary covariance 
multiplied by a sliding power factor; this factor con- 
tinuously renormalizes the average instantaneous power 
to a representative local level, chosen as the power level 
at the centroid of the points ¢ and ¢’. For simplicity, we 
denote the functions in the right side of (10) by IT, and 
T., respectively. With this notation, (10) becomes 


) ; yew Sy eae e ro) 


ye : 


IB r,( ttc eh men el) 
and we rephrase our previous definition as follows: a 
locally stationary covariance ts a covariance which can be 
written in the form (11) where T, 7s a nonnegative function, 
and YT, is a stationary covariance. We note at once that if T, 
is a positive constant, then (11) reduces to a stationary 
covariance. This shows that our definition of local station- 
arity has the desirable property of including stationarity 
as a special case. We shall now exhibit less trivial examples 
of locally stationary random processes. 

We begin by considering the function A(t — ¢’), which 
is | when t = ¢’ and 0 otherwise. This function is obviously 
of nonnegative definite type on R X R, since if 7’, is any 
finite set of real numbers and h(é) is any complex-valued 
function, then 

~~ Ae — “)ah*’) = D7 | A) |? > O. 
Bend telen: teTn 
A(t — t’) is therefore a covariance, in particular a nor- 
malized stationary covariance. Moreover, the product 


fae : 
r,( 5 Jace ae Jy 


where I, is any nonnegative function (not necessarily a 
covariance), is also of nonnegative definite type and 
therefore a covariance, since 


= r(é + aC — #)h(t)h*(L) 


= STi) hon on 


teTn 


(12) 


17In this connection, see the definition of local homogeneity 
given by A. N. Kolmogoroff, “The local structure of turbulence in 
incompressible viscous fluid for very large Reynolds number,” 
Doklady Akad. Nauk SSSR, vol. 30, pp. 301-305; 1941. 
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Seplembe 


It follows that (12) is a locally stationary covariance. A 
process with a covariance of the form (12) will be called : 
locally stationary white noise. (The word white reflects the 
fact that A(r) is the symbolic Fourier transform of ¢ 
constant in the sense described in the Appendix.) 

The example just given shows that the product I, T. 
can be a covariance without both T, and T, being co 
variances. However, if both T, and T, are covariances 
the product I',P, is automatically a covariance.”* Ii 
follows that the product of a nonnegative covariance ©: 
the form I,[(¢ + ?¢’)/2] with a stationary covariance 
T.(¢ — t’) is a locally stationary covariance. Covariance: 
of the form I, [(¢ + t’)/2] have been studied by Loéve,”"™ 
who calls them exponentially convex covariances. As noted 
by Loéve, any two-sided Laplace transform of a non- 
negative function is an exponentially convex covariance 
The proof is immediate, for suppose that 

Ci) = / exp (—ar)p(a) da, Teel 
where p(a) is a nonnegative function. Then, for any 
finite subset 7’, of 7 


i ri! ‘ acon) 


t,t’2Tn 


= [ | SX exp (—at/2)A(0 |? pl@) da > 0, 
—o teTn 

so that T,[(t + ¢’/2)] is of nonnegative definite type, anc 
hence a covariance (on 7’ X T). Conversely, Loéve notes 
that any continuous exponentially convex covariance I’, (7) 
7 ¢ T, can be represented as the two-sided Laplace trans: 
form of a nonnegative function in the absolutely con- 
tinuous case. In particular, an exponentially conve; 
covariance is nonnegative. 

We can now exhibit a large class of locally stationary 
covariances by using the fact that the product of twe 
covariances is a covariance. In fact, let T,(7) be any 
Laplace transform of a nonnegative function which con: 
verges for all +r ¢ R, and let T.(7) be any correlatior 
function. Then the product 


t+ vt’ , 
r( 9 ine t’); 


is a locally stationary covariance, which we shall call ar 
exponentially convex locally stationary covariance. It car 
be seen at once that every exponentially convex covariance 
T',(7) defined for all + e¢ R is exponentially unboundec 
either at — © or +, and consequently has no Fourie 
transform (except for the case of a positive constant 
which has a symbolic Fourier transform proportional t 
the Dirac delta function). It follows that the same is tru 
of exponentially convex locally stationary covariances 


Ube eules 


18M. Loéve, op. cit., p. 468. 

19M. Loéve, “Fonctions aléatoires 4 décomposition orthogonal 
exponentielle,” Rev. Scz., vol. 84, pp. 159-162; 1946. 

20 M. Loéve, “Fonctions Aléatoires du Second Ordre,” in P. Lévy 
“Processus Stochastiques et Mouvement Brownien,”’ Gauthie1 
Villars, Paris; 1948. 
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owever, it is possible to convert suitable exponentially 
nvex locally stationary covariances into locally sta- 
pnary covariances with Fourier transforms by the 
lowing simple device. 

We begin by considering the equality 


p [—a(é + ¢”)] 


“exp [- ae = | 


the left side of (13) is obviously a covariance, since 


exp (ab Pt) |= exp(= 


2-0; (13) 


at’) exp (—at’’). 


the first factor in the right side of (13) is a nonnegative 
nction which is not itself a covariance, since it is of the 
rm T,[(f + ¢’)/2] and is not an exponentially convex 
povariance. The second factor in the right side of (13) is 
stationary covariance, since the Fourier transform of 
xp (—ar’/2) is a nonnegative function. It follows that 
13) is a locally stationary covariance, and in particular, 
e which is not the product of two covariances. 

| Now let x (t) be a random process with the exponentially 
‘onvex locally stationary covariance 


oes ; 
| r,( 9 rac ae ys 
| 
ind let y(t) be a new random process defined by 


y(t) = exp (—at’) x(i), 


Tsing (13) we can write the covariance of y(t) 


as 
T(t, ’) = exp | ~2a(! + “|e (4 +?’ 


em | — 20 = * [rae — 0, 


where I, is an exponentially convex covariance, and Tr, 
sa stationary covariance. The product of two nonnegative 
‘unctions is nonnegative, and the product of two stationary 
tovariances is a stationary covariance. It follows that the 
roduct of two locally stationary covariances is a locally 
tationary covariance. In particular, I,(¢, t’) is locally 
tationary. Suppose now that I, is absolutely integrable, 
ind that the function p(a) appearing in the definition of 
r is such that the integral 
Ae 
2a [ 


s finite.”’ Then it follows that (14) is absolutely integrable 
and has a Fourier transform. Thus, this device enables us 
o convert a large class of exponentially convex locally 
stationary covariances into locally stationary covariances 
vith Fourier transforms. 


T(t, ’) = 


@ SW, 


(14) 


| i Sch (ENG explo? /Saynonide 


| 21 This is the case, for example, for any a > 0, if p(a) vanishes 


utside a finite interval. 
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It is apparent from the foregoing that well-behaved 
locally stationary covariances exist in abundance. 


THE GENERALIZED WIENER-KHINTCHINE RELATIONS 


We now prove an interesting generalization of the 
Wiener-Khintchine relations for the case of locally sta- 
tionary random processes. 

Theorem: Jf I(t, t’) 7s a locally stationary covariance, 
then the generalized spectral density defined by (8) is also a 
locally stationary covariance.” 

Proof: Write (8) as 


vow) [no ae 
note ants ‘) |r, t’) dt dt’, 


and introduce the new variables u = (¢ + t’)/2 and 

= t — t’. Substituting from (11), we see at once that the 
function WV(w, w’), which is itself a covariance, can be 
written 


=1b — (15) 


w ++ 


Vo, 0) = u,( aw = (16) 
where 
V(w) = 7” ie exp (—zvw)T.(v) dv, (17) 
WA) = = iE ene Cine (18) 
and 
ri@ = is exp (iuz)W,(z) dz, (19) 
Tone / > xpiGe vy Coes (20) 


Since T, is a nonnegative function, WY is a correlation 
function, and since I, is a continuous correlation function, 
WV, is a nonnegative function. It follows that V(w, w’) is 
a locally stationary covariance, as asserted. We shall call 
(11), (16), and (17)-(20), together with the stated require- 
ments on the functions involved, the generalized Wiener- 
Khintchine relations. 

The case of stationary random processes is included in 
these relations by permitting I,(7) to be the positive 
constant I',(0) = (| x |’ ), so that the covariance (11) 
reduces to (2), its form in the stationary case. We also 
admit, as is customary, the symbolic relations 


— 
l| 


i * exp (en) a) dé, 
a (21) 


@) = 5- [exp (—i&n) dn, 


As remarked in the Introduction, we assume that I (Z, ¢’) is 
continuous (except in the white noise case) and absolutely integrable 
in ¢ and ¢’ (except in the stationary case). 
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where 6(&) is the Dirae delta function which, as shown in 
the Appendix, can be regarded as an unnormalized cor- 
relation function. It follows that in the stationary case, 
the spectral density V(w, w’) reduces to the special form 
[see (4) ] 


Vw, w’) = w)d@ — wo’), (9) 


2 (ote = J ate 2a) 
2 . 

where ®(w) is a nonnegative function; as already noted, 
(9) is tantamount to the usual Wiener-Khintchine rela- 
tions. The covariance (9) is obviously locally stationary 
and is, in fact, the covariance of a locally stationary 
white noise. (To see this, use the relation 6(€) = (1/e)A(&) 
discussed in the Appendix.) 

Other illustrations of the generalized Wiener-Khintchine 
relations are the following: 

1). Let TC, t’) be the locally stationary white noise 


Nee es 
rit) = 1 5 


Jaw =¢), 


where T, is any continuous and integrable nonnegative 
function. The corresponding spectral density is 
€ 
Vw, w’) = — Rw — ’), (22) 
20 
where «¢ is the “infinitesimal positive constant’ described 
in the Appendix, and where 


R@) = 5- [exp (—in) Pula) an. 


Since R(£) is a stationary covariance, it is obvious that 
(22) is a locally stationary covariance. 
2). Suppose I(t, t’) is the locally stationary covariance 


exp | -20(! + a exp |-2 (t — un | es (G89) 


Then the generalized spectral density is 
Sloe) 
al eepoaa Nemo 


“eXp |-2 (@ — wy | ; 


which is immediately seen to be locally stationary since, 
except for a positive factor, it is of the form (13), with a 
replaced by 1/4a. 

In the case of locally stationary random processes, 
there is a manifest symmetry between the form of the 
covariance I(t, ¢’) and that of the spectral density 
W(w, w’). The reason this symmetry is absent in the case of 
stationary random processes is apparent. The spectral 
density of a stationary random noise is itself stationary 
only when the noise is white. In general, the spectral 
density of a stationary noise is the covariance of a locally 
stationary white noise of the form (9), where the delta 
function appears as the Fourier transform of the constant 
average instantaneous power which characterizes a 
stationary process. 


Ara 


V(w, w’) 


(23) 
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CONCLUSION 


It is apparent that much remains to be done in th 
theory of locally stationary random processes. Perhaps thi 
most important problem is to investigate further th 
class of noise processes which are either locally stationary 
or can be approximated in some appropriate fashion by 
locally stationary random processes. Heuristically speak 
ing, if a process is such that its average instantaneou 
power is slowly varying with respect to its memory ©: 
correlation time, then it should be possible to approxi 
mate it by a locally stationary process. (A familiar exampl 
of such a process is the turbulent velocity field at large 
Reynolds number.*’) Numerous other problems in thi 
theory of locally stationary random processes are suggeste¢ 
by analogous problems in the theory of stationary pro 
cesses. 

The author wishes to acknowledge helpful suggestion 
by Dr. I. Kay and Professor J. E. Moyal. 


APPENDIX 


Some Properties of the Functions 6(€) and A(é) 


The Dirac delta function 6(€) can be regarded as th 
limit as e — 0 of the function 


wo = 10-18), 


5.(é) = 0, 


(Ehlesrens OF 


[eles e: 


Since 


sin” (ne/2) 
(ne/2)* 


6.(é)is the Fourier transform of a nonnegative function, anc 
hence it is a correlation function. It follows that 6(é) is : 
correlation function, since it is the limit of a sequence o 
correlation functions.’* (A simpler proof follows from th 
proportionality of 6(€) and A(é) described below, sine 
A(é) is a correlation function, as previously noted. 
Moreover, since 


Ne ieat é 
3) = 52 | exp (— it) dn, 


._ gin” (ne/2) 
Gea 


we have the symbolic relations 


= 1, for all n, 


e>0 


foo) 


1 iP exp (2En) 6(€) dé, 


l| 


(21 
6(€) 


Los : 
on i exp (— tn) dn. 


The function A(é), defined to be 1 when € = 0 and 
otherwise, can be regarded as the limit as « > 0 of th 
function 


A.) 


Il 


[al Sree. 


A.(é) [Elites 


It will be observed that A,.(é) = €6.(€). A(€) is a correl: 
tion function, since A(t — ¢’) is obviously of nonnegatiy 
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lefinite type (or, alternatively, since it is the limit of a 
quence of correlation functions). Symbolically, we can 
write 


I 
nm 
~ 

— 
It 
Sg 


A(é) 


I 

| 
> 
ae 


5(E) 


idjusts 6(0) = © to A(O) = 1. The symbolic meaning of 


Summary—tIn many of the applications of probability theory to 
roblems of estimation and detection of random functions an 
igenvalue integral equation of the type 


Te 
Sova [ Ka — yey) dy, 0<a<T, 


fis encountered where K(x) represents the covariance function of a 
continuous stationary second-order process possessing an abso- 
lutely continuous spectral density. 

In this paper an explicit operational solution is given for the 
‘eigenvalues and eigenfunctions in the special but practical case 
when the Fourier transform of K(x) is a rational function of w, i.e., 


HG) Gs = 


in which N (s?) and D(s?) are polynomials in s’. 
| It is easy to show by elementary methods that the solutions are 
(of the form! 


o(z) = D0 Ce*’* cos (8,2 + ¥,), 


r 


the constants C,, a,, 8,, and y, being linked together by the integral 
equation. It is precisely the labor involved in their determination 
‘that in practice often causes the problem to assume awesome 
proportions. By means of the results given herein, this labor is 
diminished to the irreducible minimum—the solving of a transcen- 
dental equation. 

_ * Manuscript received by the PGIT, February 4, 1957. The work 
reported in this memo was performed in connection with Contract 
No. AF-18(600)—1505 with the Office of Sci. Res. of the Air Res. 


and Dev. Command. 
+ Polytechnic Inst. of Brooklyn, Brooklyn 1, N. Y. a 
- 1Subject to the usual modifications if the roots are not distinct. 
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o) 


| exp Gena ae, 
- (24) 


A® = 5 [exp (— tte dn, 

obtained by formal multiplication of the relations (21) 
by € is apparent. A(é) can be regarded as the correlation 
function of stationary white noise with unit average 
instantaneous power, just as 6(€) can be regarded as the 
correlation function of stationary white noise with infinite 
average instantaneous power. 


The Solution of a Homogeneous Wiener-Hopf Integral 
Equation Occurring in the Expansion of 
Second-Order Stationary Random Functions’ 


D. C. YOULAT 


INTRODUCTION 


The integral equation 


o@ =r] Ke, yoy, OS2ST 


makes its appearance most naturally when one attempts 
to expand a random function n(¢, w) in an infinite series 


fo) 


n(t,w) = >) alwp(t), k= 


k=1 


O<t<T, 


(2) 


where the ¢,(¢) are orthonormal over 0 < ¢ < T and the 
a,(w) uncorrelated random variables. 
Denote ensemble average by FE { } and let 


mt) = E{n(t, w)} 
and 
K(s, t) = E{[ns, o) — m(s)][n@Z, @) — m(d)]}. 


Of course, K(s, ¢) is the familiar covariance function 
of the process n(t, w); if K(s, t) is continous in the square 
0<t<T7,0<s < T, the process is said to be second- 
order continuous in 0 < ¢ < 7, and if in addition K(s, t) = 
K(| s — t|), it is said to be second-order stationary. 

From a theorem of Karhunen’ any second-order random 


2K. Karhunen, “Uber Lineare Methoden in der Wahrschein- 
lichkeitsrechnung,” Ann. Acad. Sci. Fennicae, Helsinki, vol. 37; 1947. 
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funetion n(t, w) which is second-order continuous in 


0 <t< T may be expanded as follows: 


E(w) )pi(t) 
Vx 
with convergence in the mean for every ¢ in [0, 7]. The 
quantities \, and ¢,(¢) are determined from the integral 

equation ; 


CAN) 


m(t) + ss 


k=1 


(3) 


ais 
g(t) =r | K(s, Odils) ds, er) 
v0 
The ¢,(f) is orthonormal over 0 < ¢ < T and the H,(w) 
normalized uncorrelated random variables, 7.e., 


E} pues © } 


=O bee, 


* ble ) dt = b:;, 


0 


k,j = hes 


in which 6,; is the Kronecker delta. 
In particular if n(¢, w) is stationary, K(s, t) = K(|s — t]) 
and (4) becomes 


R= me f Kee iiacods. 


KS Wi RES Ota 


For Gaussian processes the /,(w) are actually independent 
normal variables and (3) converges in the ordinary sense 
with probability one. These properties of the expansion 
have been exploited to great ANUS in many theo- 
retical applications, for example,’ ° the results therein 
fully justifying the importance of (4) and (5). The inter- 
ested reader may find alternative treatments of (5) in 
Slepian,’? DeSobrina,’ and Muller.* As far as the author 
is aware, none of these treatments yield an explicit eigen- 
value equation. 


NovTAaTION AND ASSUMPTIONS 


If what follows, we restrict ourselves to a special class 
of stationary processes. It follows directly from a theorem 
of Bochner that if K(x, y) is the covariance function of a 
second-order stationary process with absolutely con- 
tinuous spectral density, then 


K@,)=Kla-y)=f e?Ge) dw, © 


3R.C. Davis, “The detectability of random signals in the presence 
of noise,” IRE Trans., vol. IT-3, pp. 52-62; March, 1954. 

4U. Grenander, “Stochastic processes and statistical inference,” 
Ark F. Mat., vol. 1, pp. 195-277; October, 1950. 

61). Slepian, ““Estimation of signal parameter in the presence of 
noise,’ IRE Trans., vol. IT-3, pp. 68-89; March, 1954. 

61). C. Youla, “The use of the method of maximum likelihood in 
estimating continuous modulated intelligence which has been 
corrupted by noise,” IRE Trans., vol. IT-8, pp. 90-105; March, 
1954. 

7R. DeSobrino, “Optimum signal detection with incompletely 
specified signal noise,’’ Ph. D. dissertation, Columbia Univ., New 
York, N. Y.; April, 1953. 

8#,. A, Muller, ‘‘Communication in the Presence of Additive 
Gaussian Noise,’ Tech. Rep. No. 244, M.I.T. Res. Lab. of Hlec- 
tronics; May, 1953. 
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where 
Gw) = G(—a), 


It is not difficult to show’ that the eigenfunctions gene- 
rated by such kernels by means of (5) are complete in 
0 < x < 7, and that the eigenvalues are positive, their 
sole possible limiting point being infinity. 

As is well-known if stationary noise with spectral den- 
sity G,(w) is passed through a linear filter whose frequency 
response is H(tw), then the output noise has spectral 


Giww) > 0. 


density G,(w) | H(iw) |’. For lumped constant systems 
2 m 
Hoe ho t hs + ssieate ; y+ Pons . 
Go ++ 9:8 Te G28 oe eta 


S = lo, iis, 6g real 


G,(o) =1 


For white input noise with mean square unity, 
and the spectral density G(w) of the output is 


N(-o!) _ N@). 
D(—o’) D(s’) 


N(s’) and D(s’) being polynomials of degrees m and n 


HO = liGavil— 


8) 


in 8”, v2z., 
MEETS Cine? (7) 
k=0 
D(s’) = »s Onis Do x 0. 
k=0 


The total output power equals 


eOUNT erie) ; 
ihe D(-w) 

In order that it be finite the degree of D(s”) must exceed 
that of N(s’) by at least two and D(s’) must have no 
roots on the imaginary s axis. Secondly, the requirement 
G(w) > 0, w real, implies that any purely imaginary zero 
of N(s*) must be of even multiplicity. 

The inverse of (s? — a’)~*, for k a positive integer and 
the real part of a > 01s 


(=). eo all (kj 2)! | v es 
(&— 1° > G=DIMC= 7! Cay 


By expanding N(s’)/D(s’) in partial fractions it is seen 
that K(| « — y |) must be the sum of products of poly- 
nomials in | « — y | with exponentials of | x — y |. 
Denote the roots of D(s’) by + w4, + me, --: , E bp 
and assume them so ordered that 0 < Rew, < Re uw, <, 
-- , <S Re yz,. Define the polynomials D*(s) and D (s) 
by D(s*) = D*(s) D°(s), D*(—s) = D*(s), the roots of 
D*(s) being —u; and those of D’(s), + u;, j = 1, 2, 
n. Symbolically, 


Dy =) dss: d, real. 
k=0 
°D. C. Youla, “A Finite-Time Homogeneous Weiner Hopf 


Integral Equation,” Tech. Rep. No. 367, Microwave Res. Inst., 
Polytechnic Inst. of Brooklyn, N.Y.; : October, 1955. 
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The following properties of K(a) may be derived with- 
put too much effort from its representation in (6) and the 
act that its transform is a rational function of w pos- 
essing the properties enumerated above. 


ny, The highest order derivative possessed by K(x) 
at the origin is 2n — 2m — 2. 


2) p'(“)ice = (0; beac: 


ri aee 
3) D (“)icay = 0), ee <4()): 


| c) (2) = : 

4) De Kis) -==-0; “0. 

5) K(x) possesses continuous derivatives of all 
order for x ¥ 0. 


(tae DETERMINATION OF THE HIGENVALUES AND EIGEN- 
FUNCTIONS GENERATED BY A KERNEL K(\z—y|) WHOSE 
TRANSFORM IS A RatrionaL FuNcTION oF 8s” 


We now direct our attention exclusively to the integral 
quation 


iB 
o@) =| Kew — dew) dy, (8) 
Ota sd 
\where 
K@) = a : —Rep, < Res < Rey, 


hind N(s°) and D(s’) are as described in (6) and (7). The 
motation is that employed by van der Pol.'° From the 
‘breakup 


@) =r | K@ — pow) ay re 
+r | K@ — poy ay 


lof (8) and the continuity properties of K(x), it follows 
that (a) possesses continuous derivatives of all orders 
for 0 <a < T. Again by the continuity properties of 
K(x) and its derivatives, it follows that the function 
ii 

ya) = [| K@- yew dy, -2 <a<e, 
where ¢(x) is a solution of (8), possesses continuous 
‘derivatives of all orders everywhere except perhaps at 
le = O and « = T;forx < Oor2z > T these derivatives 
are given by 


d'g ao Ty) a 
oo) Bae CO 


2 et 


10 B, van der Pol and H. Bremmer, ‘Operational Calculus Based 
on the Two Sided Laplace Transform,” Cambridge University Press, 
Cambridge, Eng ; 1955. 
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Lastly, 
ena), 
(r= 0) 1) oolgerek ie Oar 
Define &(a7) by 
P(x) = o(@), Once, 
= 0 otherwise 
and let 
v(x) = P(x) — dg(a), 
= 90) rf K@- HW dy, 00 


CO ay OIE CO RR 


From the previous discussion, the properties 2) and 3) of 
K(x) and the fact that ¢(x) is a solution of (8), it follows 
that 


(od 4 

1) D (foc) = ();, oe 

2) p-(£)@ = 0 Rok 
7h) = 9; x ‘ 

3) v2) = 0, Oa an 


4) The left and right hand limits of v(x) as « > 0, 
T, exist and are finite. 


As an immediate consequence of these properties, it 
follows that V(s), the bilateral Laplace transform of v(a) 
is given by 

P(s) 


Vis) = Done e 


-sr Q(s) 
D*(s)’ 


Re ty = We shahience 


(11) 


P(s) and Q(s) being polynomials of degree n — 1 at most. 
The proof is obvious once it is realized that v(v) is a sum 
of exponentials in both the regions « > T and x < 0. 

Denote the bilateral Laplace transform of ®(2) by ®(s). 
Transforming both sides of (10) with respect to a (a 
common strip of convergence exists) yields 


Vee 9] 1 = wey 
— Rep, < Res < Rew. 
Using (11), 


D*(s)P(s) — e °"D (s)Q(s) 
D(s’) — AN(s°) ; 


G(s) = (12) 


—Rem < Res < Rey. 


However, since ®(s) is an integral function of s, 


> vi 
(#2 = -| xo(xje ** de), 
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(12) is valid in the entire strip —° < Res < © and 


numerator and denominator must possess the same 
order zeros. 

In (12) we have an explicit representation for the trans- 
form of an eigenfunction. It remains to establish the 


connection between P(s) and Q(s). 


Let the roots of D(s’, 4) = D(s*) — \N(s”) be + @,(A), 
+ w.(A), +, --- , £w,(dA) arranged so that O < Re a, < 
Rew, <, +--+: , < Rew,. In what follows we shall, in order 


to minimize the algebra, assume that \ is an eigenvalue for 
which the w,(A) are distinct. In any case there exist at 
most a finite number of eigenvalues which violate this 
assumption and they are easily determined in any given 
problem. 

By equating the numerator of (12) to zero for s t= 
w,(A), + w(A), +, «+: , + w,(A) we get the n pairs of 


equations 
(hy = = Wo jy = Oe 
jy = Ob Po 92 
m=tVGE— 40% p= + VG — 40% 


D'(o,)P(,) = &°"7D-(w,)Q@,), 
D-,)P(—w,) = e°"7D*(@,)Q(—«,), 


(13) 


r= 1,2,+++,n. 


Multiplication of corresponding sides of (13) yields [any 
factor common to both D(s’) and N(s’) is assumed to 
have been cancelled], 


Pw,)P(—@,) = Q@,)Q(—-e,), 


rp = 1,2,--- ,n. 


(14) 


The polynomial L(s’) = P(s) P(—s) — Q(s) Q(—s) is of 
degree n — 1 ins’ and vanishes for n distinct values of s’, 
namely w;, 5, --: , w,. Hence 


P(s)P(—s) = Q(s)Q(—9). 


Let P(s) = p,-1 (8 — a)(s — a2) «-+ (8 — a2). Then 
any solution Q(s) of (15) is of the form (41, jo, -+* , jn—1 18 
any permutation of 1, 2, --- ,m — 1) 


(15) 


()(s) = Dyan I] (s i) a;,) I] (s + Oey 


ORS 7 ae 


(16) 


and are 2” in number. But from (13) it is seen that the 
coefficients q,.(k = 0,1, --- , — 1) of Q(s) must depend 
linearly on the coefficients p,(k = 0,1, --- ,2 — 1) of P(s). 
This allows us to rule out all but four solutions of (16), 
namely, P(s) = + Q(s) and P(s) = + Q(—s). 
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The following example should make this clear. Suppose 


P(s) = Po + DiS + Dos 
and 
Q(s) = go + ms + as. 
Then 
P(s)P(—s) = po + (2pop. — pis + p28 
and 
Q(9)Q(—s) = 46 + (2g — gi)s + 25°, 
so that by (15) 
Po = Go D2 = Ga; 
QPP. — Pi = 202 — G 


The solutions are 


jo), = Ue {oy —=* Of 
10x = Ub P2 = Qe 
Pre= EU jo = Salhi 


To prove that P(s) = + Q(s) and P(s) = + Q(—s) are 
the only linear solutions of (16), note first that they corre- 
spond tor = n — 1 andr = O, respectively; the others 
correspond to 0 <r <n — 1. The coefficient of s”* in 
Q(s) is, for any r, given by 


Gn-2 = Fpn—1(@;, +tn@wtioue % 


d a a;, ; Oj ets 


=) nso ee aoe Oe 
Clearly if r ¥ 0 orn — 1, g,-2 is not a symmetric function 
of the roots a1, a, ++: , a, of P(s) and this in turn 
implies, a fortiorz, that it cannot be a linear combination 
of the p,. 

The two pairs of solutions P(s) = O(— 8) Ps) 
+ @(s), coincide when m = 1. A glance at (13) reveals that 
the former pair is consistent with both the upper and 
lower sets of equations and that for n > 1 the latter pair 
is not. Thus the only possible relations between the poly- 
nomials P and Q are P(s) = + Q(—s). Substituting for 
Q(s) in the top set in (13) we get, 


~ 


n— 


[I F (—]"2,lorm = 0, (17) 


> 
ll 
Oo 


where 


L,= 1) 2s 


—wrT Ben | 

| Bee ee 
In order that a nontrivial solution for the p, exist, the 
determinant of the system must equal zero. Hence, 


P57 
al =n a), (ges Xy)@ 


(1 =e eo) (1 + 42) We 


1+a,), G+ 2,)o, 


hese two transcendental equations serve to determine 
he eigenvalues which are contained implicitly in the 
*s and w’s. 

| When n = 1, (18) reduces to x, = + 1, or 


(19) 


eorem reveals that for any n > 0 all solutions of 
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(18) 


ee) eons 


For n > 1 (18) may be written in a more manageable 
form as (we choose n = 3 to avoid cumbersome deter- 


tanh 6, 


7 ae D*(s) 
oe 


(20) 


jie on the purely imaginary s axis. Denote them by 
it 18,(k = 0,1, --- , ---). After a little algebra we find 
hat the roots of (20) may be determined from the two 
:ranscendentals 


tan pr _ _4ap — 4,8" eG Bee 
= ; } a 
24 ae > dB + d.8 aa 21) 
eee _ ob — OB Seid Se 
2) dope edb 
ow when 
n = 1, NG =a, > 0, 
GP = abpe— cs Done 0; 
= , Qo ads bo 
a C ( 2), (21) 
| D*(s) = Vbo + Ves 
rnd reduce to 
tan — ae 
nd . a 
(LO ae 
ctn 5 = ie B 
The , are calculated from 
2 


(do = dow, = oe -) iw, dsw, ge i: -) tanh (w,T'/2) ; 


minants) 
1 «coun, w; tanh 6, 
N= ll @, Con @, oy tan hdy | 208 (24) 
1 ws; coth 6 ws; tanh 6, 
where 
ed i (25) 


The other equation is obtained by interchanging tanh 6, 
and coth 6, in (24). 

Once the eigenvalues \, have been determined, the p, 
are found from (17) and through them ¢,(x). Explicitly, 


P(s, d,)D'(s) 
D(s’) — »,N(s") ’ 


(x) = (26) 


Res > o,(,), 
O< #< 


The ¢, defined by (26) are orthogonal but not orthonor- 
mal over 0 < x < T. An expression for the normalizing 
constant can probably be derived but its undoubted 
complexity makes it useless for practical work. 


THe Semi-InFINITE RANGE SINGULAR 
INTEGRAL EQUATION 


The solution of the homogeneous integral equation 


6a) =» [ K@- Ded, O<2<> 


has been given by Wiener and Hopf for a wide class of 
kernels and can be found reproduced in Titchmarch’s 
book.” However, when the transform of K(z) is a rational 
function of s* our elementary technique gives the solution 
immediately. Omitting the details and adhering to the 
same notation as before, we find that 


uF. C. Titchmarsh, “Introduction to the Theory of Fourier 
Integrals,’’ Clarendon Press, Oxford, Eng., 2nd ed.; 1948. 
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bon(d,Da—3 ae OC  apaas Ale naib) mo (Don—2 =. AA sia) dnjPn—1 


¢'"(0", ») = 


(SD AS) as 
~ D(s’) — AN(s”)’ 


Res > Rew, (A), 


(s) (28) 


the polynomial P(s) being arbitrary and of degree n — 1. 
The eigenvalues are determined by the requirement that 
the integral in (29) converge. This leads to 


Re w,(A) < Re wy. (29) 


All X satisfying (29) are eigenvalues. 

The coefficients p, of P(s) serve to fix the initial values 
$(0°, ), ¢’ (0°, d), --- , 6” (O"," A). Since the number 
of linearly independent polynomials P(s) is n, each eigen- 
value has n-fold degeneracy. To give a formula for ¢” 
(07, r), @ = 0,1, --+ , %— 1) in terms of the 9;, az, and 
b.,, 1t is first necessary to introduce some preliminary 
notation. Let 


2n—1 


P(s)D*(s) = Oy A,s° 


and 
DV aN Ni(sou = as, 
k=0 
where 
k 
A; = SS Di=; d; 
7=0 
and 
Ba, as box — NAg,. 
It is understood that A,, = 0 for k > m and any p or d 
whose subscript exceeds n — 1 and n, respectively, is to 


be taken equal to zero. Then (again omitting details), 


Q’ +3 


¢'(0",») = (HID! Gas PiU eee ia ia es (30) 
2n 
where 
0 1 0 
GE Ss).0 0 Boras 
Apes Bon-2 0 (31) 
0 Jee 0) 0) 
OF = 1.0 0 bee Oieiteuce 
| Ae Bon-2 0 Bon 
For example, 
+ d, n— 
o(0*,d) = Pe, 
6'(0*, d) = OR ee + Ce (32) 


Don : 


2 
Don 


ete. 


ILLUSTRATIVE EXAMPLES 


Consider the Picard kernel 


2 
K@) = ve" = roe aoe ey ey 
—k <Res<k. 
Here a = 2k8’, bo = k, and c = I. Fromi(@2)sanga 
we get 
tan ee = —£/k, 
2 
etn peey B/k, 
2 
_k+ 8: 
Ar = Ohe? 
and 


$(x) = zsin 6,7 + cos 6,2. 


The ¢,(x) is not normalized. 
A more complicated specimen is the kernel 


2 ee 


V2 v2 
BONS ee oO ed 
D@=8 = V2s— 


_ : 1 
K(x) = ua 


and 
D(s’) — \N(s’) = 8&* — (A — J). 
Let 
tos(N) = (NE 1)F ee 
and 


w3(A) = te 


where Im. « < 0 < Re. ¢«. From (24), the eigenvalue 
equations are found to be 


> coth 4, 


w, coth 6,, 
and 
w, tanh 6, = w, tanh 6, 
in which 
tanh oe (1 + @}) tanh (7/2) + V Qe, 
(1 +2) + 4/20) tanh @17/2) 


(1 + o) tanh »T/2) + Ve». 
(1 + 3) + V2w, tanh (7/2) 


tanh @, = 
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| ae w, and w, by their values in terms of ¢ they ae Wk fi re S cea ated | 


r r ,~ 7 even, 


i ie 
iV 2¢ + (1+ ) tanh (€7'/2) iersaene: 
+e) + 0/2 tanh (1/2) Veen = ie 
ie il 


4 We = (1 — &) tan (eT/2) 


If this same kernel is used over the semi-infinite range 


fel =) — W/% tan (€T'/2) (33) the eigenfunctions are most simple, v7z., 
| . P(s\(s? + Neal 
|| Denote their solutions by «, 6, --- , --- , the odd $a, d) = s +=») ; ORS See 
jtbscripts corresponding to the plus sign. Then, \, = €," ; 
Ly es 1.2 es Sag from an) - “To find the eigenvalues set \ — 1 = r* e’”’. Then, w.(d) 


= re“ < p < 2m) and from (29) r cos (p/4) < 1/2; 

a point (7, p) in this region determines the eigenvalue 

Dias = tanh 6,(e,), r odd Mr, p) = 1 + r* e’. Hence the strip of covergence of the 

i above transform is Re s > r cos (p/4). Note that now the 

eigenvalues are no longer discrete but form a continuum. 

re eee 6,(e,), r even This is not surprising since K is not square-integrable 

& over 0 < 2, y < o and, consequently, the theory of 
bounded linear symmetric operators is inapplicable. 
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The Correlation Function of 
Smoothly Limited Gaussian Noise’ 


R. F. BAUMt 
Summary—tThe correlation function of ‘‘smoothly” limited e) =u fof ee eee 
faussian noise is calculated and compared with the correlation y(o) m2) 
inction of ‘‘extremely” clipped Gaussian noise. The limiting == ays for x > Xo 
nction is assumed to have the shape of the error integral curve. 
he output spectrum is calculated for the case of noise passed = —2 for NPE 


shrough an RC filter. and for varing ratios of rms noise to clipping threshold ao. 


As this ratio approaches infinity the case of extreme 
clipping is reached. 

Instead of clipping suddenly, when x exceeds the thres- 
NHE effect of clipping on Gaussian noise has been hold a, smooth or gradual limiting may be preferred. 
investigated in detail." Van Vleck has analyzed The limiter characteristic then will have the aspect of 
the case of an amplitude limiter with an output Fig. 1(a). Van Vleck has calculated the autocorrelation 
s input characteristic y(x) described by function of the output noise for a limiter of the shape 


I. InTRODUCTION 


| 
i 


| S =il 
| * Manuscript received by the PGIT, March 13, 1957. y(x) =¢ tanh (@2). 

_ + The W. L. Maxson Corporation, New York, N. Y. a ; 

1. H. Van Vleck, “The spectrum of clipped noise,” RRL Rep. Unfortunately the result cannot be expressed in closed 


f; Jul Pall 1943. +5 : sae g s nd Battin.’ 
Seed Lawson and G. E. Uhlenbeck, “Threshold Signals,” form. Another example is discussed oe Laning a 
eGraw Hill Book Co., Inc., New York, N-Y., p. 58; 1950. The present paper assumes a limiter of the form of an 


die del. Laning and R. H. ‘Battin, “Random Processes in Auto- ae Thi rve has considerable practical 
matic Control,” McGraw-Hill Book Co., Inc., New York, N. Y., error-integral curve. needa P : 


b. 163; 1956. importance because: 
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(a) 


y? output 


KF wput 


(b) 
Fig. 1—Limiter curve. (a) oo; # 0; (b) oo; = 0. 


(a) The probability density distribution of the output 
noise can be made constant over a prescribed range 
of output voltage. The spikes of the distribution 
curve, at the threshold level of which accompany 
sudden clipping, disappear. The output voltage 
spends the same average time at any voltage level 
within the range. 


(b) Therefore, if the output noise is used to fm modu- 
late an rf carrier, the output power can be evenly 
distributed over a sharply defined frequency band. 

The output correlation function appears mathematically 
in closed form, which facilitates the calculation of the 
output spectrum for a prescribed input spectrum. As an 
example, the output spectrum for Gaussian noise passed 
through an RC filter is found and compared with the spec- 


trum of “extremely” clipped noise. 


II. CorRELATION FUNCTION AND SpEecTRUM AFTER 
SmoorH LIMITING 


The smooth clipping function is defined by 


eee ‘ie —2z7/200 
= € “de 
=k Deane 9 


where x is the input voltage and K is a constant. Since 
this function is proportional to the integral of a nor- 
malized Gaussian probability curve of rms value ao,, this 
parameter will be called rms value of the limiter curve, 
as distinguished from the rms value a, of the input noise. 


(1) 


ACTIONS ON INFORMATION THEORY 


Seplembe: 


The parameter 
Tos 
n= ANS: (2 


then defines the ratio of the two rms values. 

The shape of the limiting curve is indicated in Fig. 1 (a) 
The curve limits the noise to values between plus anc 
minus 1/2K. The special case of extreme clipping showr 
in Fig. 1(b) is obtained from the result as o), approaches 
zero. 

The particular choice of this function is explained by 
the fact that for a = 1 the input probability distributior 
P,;, [Fig. 2(a)] is converted to the output probability 
distribution P,,, [Fig. 2(b)], which is constant betweer 
the values + 1/2K. 


+ ?, ut 
K 
aol = uA 
aK 2K 
(b) 
Ter 
ame 
a ba 2% 
(c) 


Fig. 2—Probability distributions. (a) Gaussian (input), (b) smoothly 
limited (a = 1), (c) extremely clipped (a = 0). 


The output probability distribution can easily be 
established by application of the well-known formul: 


Nees 
Peat on dy/dx (3 
by using the derivative of y from (1) and with 
1 ; 
PP, = See 4 
V 210, ( 


1957 


t2K 
5 \ = 


Fig. 3—Probability distribution for varying input levels. 


‘This results in 


Pour(2) =a KN a eo (22/240) (amd) (5) 


Since x and y are related by (1), this also gives P,,, as 
bh function of y. In Fig. 3, P.y./K is plotted as a function 
pf 2Ky with different values of a as parameter. 

It is interesting to note that at 2Ky = 1, Pou, remains 
vero for a < 1 but develops an infinite peak for a > 1. 
As a — O (extreme clipping) only these peaks remain 
Fig. 2(c)] since the output noise can have no other values 
but y = + 1/2K. Furthermore, as (5) shows, fory = x = 0 


P(0) 
Sie Va. (6) 


f the normalized input correlation function p, is given, 
the normalized output correlation function P, can be 
found. The details of the calculation are given in Appendix 
IA. The result is 


l+a 
In particular for a — 0 (extreme clipping) one obtains 


Pee cine op: (8) 
Tv 

| which checks with the value obtained in Van Vleck.’ 
Also, for a = 1 (constant output probability distribution) : 


P= Oein iG (9) 
T 2 


Baum: The Correlation Function of Smoothly Limited Gausisan Noise 


195 


For a-— © (amplification) we obtain the obvious result 
ie 

The output spectrum is the cos transform of the auto- 
correlation function: 


4 S 

[oS rezk ( 1 pt sin’ (; a) coswrdr. (10) 
sin 1 aay 0 

An evaluation of S(w) for the case a = 1 is given in 


Appendix B for Gaussian noise passed through RC 
filter of bandwidth B, for which 


(11) 


One may expect from intuition that the harmonic 
content of smoothly limited noise is far less than that of 
extremely clipped noise. In general the output spectrum 
can be thought of as resulting from the superposition of 
an infinite number of spectra extending individually from 
zero to B, 3B, 5B, --- etc., as shown in Fig. 4. 


Sw) 


Fig. 4—Output spectrum after limiting. 


Table I compares the amplitude of the first three spectra 
for smoothly limited and extremely clipped Gaussian 
noise. The amplitude of the basic output spectrum of 
bandwidth B is 3/m and 2/z, respectively, as shown in the 
first column. The last column gives the sum of the first 
three spectra. This equals, for a = 1, almost one. There- 
fore the power outside the band B is quite small. 

Finally it is pointed out that (7) can also be used to 
calculate the input correlation function p,, if an output 


TABLE I 
B 3B 5B Sum 
Smoothly 
Limited 0.9549 | 0.0398 | 0.0045 | 0.9993 
a=1 
Extremely 
Clipped 0.6366 | 0.1571 0.0478 | 0.8415 
a=0 
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correlation function P, is prescribed. For instance, it 
asked if it is possible to completely eliminate 
the spread of the spectrum by prescribing an output 
correlation function of the form of (11). The answer is 
negative, since the calculation leads to negative terms in 
the expression for the necessary input spectrum. 


could be 


APPENDIX 


A. Evaluation of the Normalized Output Autocorrelation 
Function 


If the input autocorrelation function g, is given, the 
output autocorrelation function of limited Gaussian 
noise is given by the integral* 


= ll F(ju) ‘| Kiger ee ee" aan (12) 
dor C tal 


where F(ju) is determined by the limiting function y(a): 


a 2) = al F(ju)e’** du. (13) 
Pais Aes 
If we use the imaginary axis for the contour C, (ju) 
becomes the Fourier transform of y(). 
In the following analysis we shall substitute for the 
limiting function y(w), the limiting function y*(x): 


yX(a) = ya) + 55 


the additional constant is obviously chosen so as to make 
y* > 0as2— —~o. The Fourier transform of y*(x) is’ 


—(G08/2)u? 


que 


F(ju) = (14) 
The use of y* instead of y will merely add a de voltage to 
the time varying output voltage, which otherwise main- 
tains its shape. Therefore, the output correlation function 
will contain a constant term, due to the de voltage, and 
a time dependent term, due to noise, which is the same 
for y*(x) and y(x). As the calculation shows, the constant 
term appears in the form of an integration constant (21), 
which therefore has to be disregarded. Because of the sym- 
metrical limiter curve the input noise by itself cannot 
account for a de component in the output. 

By substitution of (14) into (12) the integral becomes 


1 i iE 1 [(got+¢05)/2] (u? +0?) —uv 
mee ° os a Or 
FS == 65 dudv. (15 
An’ K? J_. Joo WD (15) 


We may eliminate the factor 1/w by first calculating the 
derivative of y, with respect to g, and simplify further 
by introducing the new variables: 


Y= 


, _ _ |%o + Gos 


2 


48. O. Rice, “The mathematical analysis of random noise,” Bell 
Sys. Meche dis vol. 23, pp. 282-332; 1944; vol. 24, pp. 46-156; 1945. 

5 Gabe Campbell and R. M. Foster, “Fourier Integrals for 
Practical Applications,” Bell Sys. Monograph B-584, 1942. Pair 
726.2 and 210. 
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This results in: 


dy, Le 2 
do, Ag K? a5 ++ oe 


co co 
/ / Pm od Meat ake LAL SAN du’ dv’. 
—-O Jv —-Oo 


This integral can be evaluated by a method indicated 
by Rice by introducing two new variables y, and Y 
related to w’ and v’ by 


(16) 


Ua ae aw y 
: V1 0s ; 
v’ : Yy 
V1 eae ; 
where 
Gr 
ae 
om) ain 0s 
Since 
il 


du’ dv’ 


== dy, dy2 
Wale) eee 


and since the limits stay the same, the integral becomes 


dy, 1 1 1 


dp, 2rK? o, + Cos Ai Tee a2 
a ee (17) 
fo [co ans eu 
The double integral equals 7 and we obtain 
dy, 1 1 rl (18) 


de, IK’ So + os ea 
“Ga 
do Be TOs 


By integration follows, with C as integration constant 


Aan 1 ERE | ( G; ) ry 
SiS on kK? Sim eee + C. (19) 
The normalized input correlation function is 
p, = = (20) 
on) 


and with the parameter a = o9,/ao [as in (2)] ¥, becomes 


OEE iil Pr 
sin (2) 


Mase or ort (21) 
p, equals one for r = O, therefore 
sin" ( 4 ) 
l+ea 
Vi Stormo ge 


and the normalized output correlation function becomes 


ba 


eu 


(23) 


Sn - 
sin cee 


hich is our final result. 
That the integration constant C in (21) is zero may be 


i 


(24) 


}since yp» represents the output rms value it may also be 
bbtained by direct integration of the output probability 
distribution 


+1/2K 


Ky dy = 


SOA 


(25) 


1 
es! 12K 


By comparison with (24) we find that indeed C must 
(em zero, as postulated in the previous discussion. 


B. Evaluation of Output Spectrum for Gaussian Noise 
Passed Through an RC Filter 


The output spectrum is given by (10). 


4 ix. -1 Pr ) 
ae i ‘| sin (. i cos (wr) ar. 
l+a 


(26) 


Summary—tIn this paper we wish to show that the fundamental 
‘problem of determining the utility of a communication channel in 
conveying information can be interpreted as a problem within the 
ramework of multistage decision processes of stochastic type, and 
4as such may be treated by means of the theory of dynamic pro- 
¢gramming. 

We shall begin by formulating some aspects of the general 
problem in terms of multistage decision processes, with brief 
descriptions of stochastic allocation processes and learning pro- 
cesses. Following this, as a simple example of the applicability of 
the techniques of dynamic programming, we shall discuss in detail 


Kelly that under certain conditions, the rate of transmission, as 
defined by Shannon, can be obtained from a certain multistage 
decision process with an economic criterion. Here we shall complete 
Kelly’s analysis in some essential points, using functional equation 
techniques, and considerably extend his results. 


* Manuscript received by the PGIT, February 19, 1957. 
+ The RAND Corporation, Santa Monica, Calif. 
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a problem posed recently by Kelly. In this paper, it is shown by | 


19%, 


The solution is found by expansion of the sin * term in 
the integrand into a Taylor series and integrating term 
by term. 

In particular for a = 1 


een dle: pr mae) 3 ( 7 
he (2) = & 45 (2 \ 5i4 Vou) ee ee 


The series converges rapidly due to the factor (1/2) with 
which p, appears multiplied. Taking only the first term 
we obtain 


Siw) = 24 i p, COS (wr) dt = : Sin(@) 
0 


for any input spectrum S,,(w). 
For p, according to (11) the output spectrum becomes 


Saye 34 if Mes 0.0417 : 
ALG G+) 
Late ep 3B 
i Oe AGE 
Ls (4) 


as compared with the normalized input spectrum 


4 i 
Su@= 5 eee 
uae () 


The factor 3/m equals 0.9549. 


On the Role of Dynamic Programming in 
Statistical Communication Theory’ 


R. BELLMAN7 anv R. KALABAT 


I. InrTRODUCTION 


N THIS PAPER we wish to show that the funda- 
| mental problem of determining the utility of a 

communication channel in conveying information 
can be interpreted as a problem within the framework of 
multistage decision processes of stochastic type, and as 
such may be treated by means of the theory of dynamic 
programming.’ ~* 


”) 


Princeton University 


Bull. 


tR. Bellman, “Dynamic Programming, 
Press, Princeton, N. J.; 1957. 

2R. Bellman, “The theory of dynamic programming,” 
Amer. Math. Soc., vol. 60, pp. 503-515; November, 1954. 

3R. Bellman, “Some functional equations in the theory of dy- 
namic programming, I. Functions of points and point transforma- 
tions,” Trans. Amer. Math. Soc., vol. 80, pp. 51-71; September, 1955. 
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This paper is to be envisaged as a step in the direction 
of a broad theory of communication, as contemplated by 
Wiener in his recent article,’ and following the pioneering 
efforts of Rice,’ and Shannon.” Among other steps along 
this path, we would like to cite the recent articles of 
Bussgang and Middleton,’ and Middleton and Van Meter,” 
which employ the modern theory of statistical decision 
functions and sequential analysis, due to Wald.” 

We shall begin by formulating some aspects of the 
general problem in terms of multistage decision processes, 
with brief descriptions of stochastic allocation processes 
and learning processes. I’ollowing this, as a simple example 
of the applicability of the techniques of dynamic program- 
ming, we shall discuss in detail a problem posed recently 
by Kelly.’° In this paper, it is shown by Kelly that under 
certain conditions, the rate of transmission, as defined by 
Shannon” can be obtained from a certain multistage 
decision process with an economic criterion. Here we 
shall complete Kelly’s analysis in some essential points, 
using functional equation techniques, and considerably 
extend his results. 

In addition to the original problem of Kelly, we shall 
consider a time-dependent case, a process involving 
correlated signals, and a multisignal case, in both discrete 
and continuous versions. It will be seen that the logarith- 
mic criterion function plays an extremely important role, 
as its special functional properties permit us to obtain 
explicit representations for both maximum return and 
optimal policy. . 

It is important to observe that the method we employ 
is equally applicable to criterion functions of varied 
analytic form. In any case it leads to a simple computa- 
tional solution of the associated maximization problem. 
Furthermore, it is applicable to processes where the law 
of large numbers is not unrestrictedly applicable, such 
as those of finite duration, those with correlation, and so 
forth. A decision process automatically introduces a 
correlation between successive events. 

Finally, we discuss briefly a functional equation arising 
from the general question of defining the “‘value”’ of a 
communication channel in a fashion which is independent 
of its use. 


4N. Wiener, “What is information theory?” IRE Trans., vol. 
IT-2, p. 48; June, 1956. 

5$. Rice, ‘“Mathematical analysis of random noise,” Bell. Sys. 
Tech. J., vol. 23, pp. 282-332; July, 1944; vol. 24, pp. 46-156; 
January, 1945. 

6 C, Shannon, “A mathematical theory of communication,” Bell 
Sys. Tech. J., vol. 27, pp. 379-423; July, 1948; pp. 623-656; October, 
1948. 

7 J. Bussgang and D. Middleton, “Optimum sequential detection 
of signals in noise,’”’? IRE Trans., vol. IT-1, pp. 5-18; December, 
1955. 

8D. Middleton and D. Van Meter, “Detection and extraction 
of signals in noise from the point of view of statistical decision 
theory,” J. Soc. Indust. Appl. Math., vol. 3, pp. 192-253; December, 
1955; vol. 4, pp. 86-119; June, 1956. 

9A. Wald, “Statistical Decision Functions,’ John Wiley & Sons, 
Inc., New York, N.Y.; 1950. 

10 J, Kelly, ““A new interpretation of information rate,” Bell Sys. 
Tech. J., vol. 35, pp. 917-926; July, 1956. 


IRE TRANSACTIONS ON INFORMATION THEORY 


September 
Il. Tue UnpErtyInG MopEu 


Let us begin by constructing a simple model of one 
aspect of the general communication problem. It will be 
reasonably clear from what follows how more intricate 
models may be constructed to take account of more 
complicated systems. 

Consider a source S which produces at discrete times” 
a sequence of pure signals, together with noise, which may 
be of either stochastic or deterministic type, depending 
upon our further assumptions concerning the structure 
of the system. The combined signal is fed into a black 
box which we call a ‘communication channel,” which, 
in turn, emits a signal. This output signal is observed. 

On the basis of the observation of the output signal, 
it is desirable to make various deductions concerning the 
properties of the original pure signal. 

Schematically, 


Communication 


Source ————> —-> Observer. 


Channel 


hh 


~ 


mathematical terms, let 


x = the pure signal emanating from S. 


r = the noise associated with the signal. 
= F(z, r), the input to the communication system. 
y =the signal transmitted to the observer by the 
communication channel. (1) 
Let us further write 
y= Te) TE), (2) 


where 7 represents the transformation of the input signal 
x’ due to the communication channel. 

Consider the set of all communication systems, or, 
equivalently, the space of all associated transformations, 
T. We wish to introduce an ordering, or, what is much 
more satisfactory when possible, a metric which will 
enable us to compare two communication systems and 
to evaluate their performance (Section X-C). 

It must be stressed at the very outset of an investi- 
gation of this type that it should be possible to accomplish 
this aim in an unlimited number of ways, dependent upon 
the source, channel, nature of the observer, and the 
personal philosophies involved; 7@.e., upon the utility 
scales employed. 

In the following sections, we shall present two alternate 
methods for evaluating a communication system. Although 
each is a particular case of a more general scheme, whick 
we shall discuss subsequently, it is worthwhile to present 
them separately first, as they occur in important appli: 


4 The case of continuous signal emission can also be treated by 
the methods outlined below, at the expense of the introduction o 
more sophisticated concepts. We prefer to keep the mathematica 
level moderate in this first discussion; however, see Section X—D 
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ations. In this way, we hope to avoid the usual risk of 
oscuring the issue by extreme generality. 


Ill. A Srocuastic ALLocaTion Process 


Let us assume that the observer has a sum of money, 
r resources of other types, which we denote by the vector 
, called the state vector. Upon receiving a y signal, the 
bserver is required to make an allocation of resources 
» various activities. The effect of this allocation is to 
lange x into R(x, y), a stochastic vector whose distri- 
ution we shall assume here to be known. The case in 
thich the distribution is not known is closely allied with 
he second model we shall discuss. 

The process is now repeated N times, where NV may be 
ed, which is the simplest case, or the number of stages 
hay depend upon the process itself as a consequence of 
preassigned stop rule. Let us again consider only the 
implest case, that of fixed NV. 

Further, let us suppose that the purpose of the observer 
h carrying out this process is to maximize the expected 
lalue of some function of his final state vector, the state 
ittained after NV stages of the process. 

Let f, denote this maximum expected value, and f; the 
aximum expected value when the transformation 7' is 
ne identity transformation, the case in which we have a 
istortionless communication channel. 

| Let us then agree to measure the worth of the original 
jommunication system by means of a preassigned function 
ff, and f;. In this fashion we introduce a metric into the 
pace of transformations T, and thus, into the set of 
‘ommunication channels. The simplest cases are those 
where we use a function of f; — fr, or a function of f7/fr. 
_ We shall discuss a simple case of a process of the above 
type in later sections. For the formulation and mathe- 
atical discussion of some particular processes of this 
¢eneral type we refer to Robbins” and Bellman.” 


IV. A Srocuastic LEARNING PRocEsS 


_ Let us now consider a different type of stochastic 
process. The observer is required to make a decision con- 
terning the nature of the pure signal emitted by S. 
Subject to constraints imposed by the costs of observation 
and by limitations of time, he can observe as many 
samples of the signal emitted by the communication 
system as he wishes. 
| As a result of these decisions, he makes an estimate 
-oncerning properties of the pure signal, and thereby in- 
“urs a cost dependent upon the deviation of this estimate 
from the actual situation. 

The problem is to carry out the process of first obser- 
vation and then estimation so as to minimize the expected 


| 
: 12H. Robbins, “Some aspects of the sequential design of experi- 
iments,” Bull. Amer. Math. Soc., vol. 58, pp. 527-536; September, 
11952. 

|| 13. Bellman, “A problem in sequential design of experiments,”’ 
‘Sankhya, vol. 16, pp. 221-229; April, 1956. 
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total cost, where the total cost is a given function of the 
costs of observation and the cost of deviation. 

The theory of sequential analysis is devoted to one 
aspect of this general problem. Other aspects arise in the 
theory of learning processes.’°’”* 

It is clear that we can define the worth of the communi- 
cation channel in a manner completely analogous to the 
procedure discussed above. 


V. A More GENERAL PROCESS 


It is clear that both processes are particular cases of a 
more general process where neither the structure of, nor 
the transformation due to, the communication channel is 
completely known. Each stage of the process yields a 
certain return, which may be negative, in resources, and 
yields additional information, which may also be negative, 
concerning the intrinsic structure of the combined system. 

The problem is to carry out the sequence of decisions 
so as to maximize some function, which may not neces- 
sarily be completely known, of the total returns and the 
information pattern. 

It is interesting to observe that posed in this way, we 
encounter one of the basic problems of experimental 
research. 


VI. Discussion 


For the above approach to be fruitful and to represent 
more technology than tautology, one must possess mathe- 
matical techniques capable of formulating in precise terms, 
and of treating, processes of the kind described above. 

The theory of sequential analysis developed by Wald, 
Wolfowitz, Blackwell, and Girshick provides an approach 
to one class of problems of this type, while a general 
approach to these multistage decision processes is pro- 
vided by the theory of dynamic programming of Bell- 
mane 

As an application of these general methods, we shall 
consider a simple interesting model proposed recently by 
Kelly,’® and some generalizations. 


VIL. Tor Mopet or KELLY 


Let us begin by treating the first problem posed by 
Kelly. 

A gambler receives advance information concerning 
the outcomes of a sequence of independent sporting 
events over a nolsy communication channel. We assume 
that the outcome of each event is the result of play be- 
tween two evenly matched teams, and that p is the 
probability of a correct transmission, and g = (1 — p) the 
probability of an incorrect transmission. 

Assuming that the gambler starts with an initial amount 
x and bets on the outcome of each event so as to maximize 
his expected capital at the end of N stages of play, it is 
clear that he wagers his entire fortune on each play if 
p > 4, and nothing if p< 3. If p = 3, it makes no dif- 
ference what ‘policy he employs. (We are supposing that 
the gambler must bet on the received signal, if at all. It 
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is easy to see that if we allow him complete freedom in 
5, his bet will 
always be contrary to the information he receives.) If 
> < p < 1, with probability one the gambler will go broke 
following such a policy, since eventually he must lose a bet. 

A much more difficult process arises if we take p to be 
a fixed, but unknown quantity which must be determined 
on the basis of the observed results of betting. This leads 
to a “learning process.’’? An expository treatment con- 
taining a number of additional references may be found 
in Robbins,’* while a treatment by dynamic programming 
of a similar problem may be found in Bellman.”® 

Let us now assume that the above mode of play appears 
too hazardous to the gambler, and that he wishes to 
pursue a more conservative policy, one that will prevent 
him from ever being wiped out. He may then proceed to 
maximize the expected value of the logarithm of his 
capital at the end of N stages of play (see Section XI). 

For the one-stage process, he is faced with the problem 
of maximizing 


placing bets, then, in the case where p < 


E\(y) = p log («@ + y) + q log (a — y) (3) 


over all y in [0, x]. Here y is the amount wagered, even odds 
being assumed. It is easy to see that, if p > q, we have as 
the maximizing value of y, 


y=(- oe, (a 
and for that value of y 
E, = logx + log2+ » log p + q log q. (5) 


If p < q, the maximum is at y = 0. 

It is not difficult to show that if we consider N-stage 
processes. where we restrict ourselves to wagering policies 
which require the wagered amount to be a jfived pro- 
portion of the total capital at each stage, then the policy 
described above is optimal. This result was established 
by Kelly’’ in a very ingenious fashion. 

Let us now demonstrate that this policy is optimal 
within the class of all wagering policies. 


VILL. Economic ForEcASTING 


It is clear that the above mathematical model is ab- 
stractly identical with problems that arise in connection 
with economic forecasting, in particular, and with fore- 
casting, in general, as for example, weather prediction. 

In these cases, the physical world is the source and the 
scientific corps, both experimental and theoretical, the 
communication channel. Sometimes, the theorist or 
experimenter is also the observer; at other times, it is the 
business man or politician who must decide to what 
extent he trusts his communication channel. 


IX. Dynamic ProGRAMMING APPROACH * 


Let us begin by formulating the problem in dynamic 


14The results contained in this section answer the fundamental 
question posed by Kelly, op. cit., p. 926. 
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programming terms. Define the following sequence 0} 
functions, 


f(a) 


expected value of the logarithm of the fina 
capital obtained from an WN-stage process 
starting with an initial capital « and using an 


optimal policy. (6) 


An optimal policy is here defined as one which maxi- 
mizes the expected value of the logarithm of the capita 
at the end of NV stages. 

The mathematical results we shall obtain depend upon 
the intuitive principle of optimality: An optimal policy has 
the property that, whatever the initial state and initial decision 
are, the remaining decisions must constitute an optimal 
policy with regard to the stale resulting from the first decision. 

For this particular process, the meaning of the state- 
ment is that, regardless of how much is wagered by the 
gambler at the first stage, and regardless of whether he 
wins or loses, he utilizes the amount of money in his 
possession prior to the betting at the second stage in 
such a way as to maximize; 7.e., he proceeds so as to maxI- 
mize the expected value of the logarithm of his capital 
after the remaining (NV-1) stages. 

The mathematical transliteration of this statement 
yields the recurrence relations 


fila) = loga+K, 
fw(x) = Max [pfy-s(% + y) 


O<ysax 


7) 


“+ Cp) frei) an Naess 
where 
x — Joe2+plogp+aqloe¢ p>a og 
lo, (De) 
Let us now demonstrate the 
Theorem: For N > 1, we have 
f(a) = loga + NK, (9) 


where K is defined as above. The optimal policy is unique 
and independent of N. It consists of choosing 


(10a) 
(10b) 


(ae, Diag: 


0, 


U = 


Y— 1 Sb 


Proof. Let us proceed inductively, beginning with the 
known result for N = 1. Assuming that the result holds 
for N, we have, 


fr+i(t) = Max [p [log (x + y) 


O<y<z 
UN IG) ep) Oey aa 
= Max [p log (« + y) 


O<ySz 


ECL =p) log (rig) act (hu 


{ +(t) = (loga + K) + NK = log x 


REN eae (12) 


|| The statement concerning the form of the optimal 

jiolicy follows from the analytic form of fy(x). 

|| Now that the “best” performance of the noisy channel 

ie been determined, it may be compared in various 
ossible ways with the performance of a perfect channel. 


X. GENERALIZATIONS 
I. Time Dependent Case 


Before proceeding to more general cases, let us consider 
simple extension of the above model. 

First, let us suppose that at the kth stage the probability 
Af correct transmission is p,, and of incorrect transmission 
|, = 1 — p,. For fixed N, define the sequence of functions 


f.(a) = expected value of the logarithm of the final 
capital obtained from the remaining k stages 
of the original N-stage process, starting with 
an initial capital x, and using an optimal 


policy. 

Then 
fiz) = Max {pw log (x + y) + qw log (x — y)}, 
| O<y<z 
(x) = Max (Pn—e+1 eee (x St y) 
| O<y<z 

nee we (Gt me ON ek 2: (14) 
| As before, it follows inductively that 
F(x) = log x + k log 2 

N 
+ 2 ip logp, +4, loga], (15) 

provided that p, > 4 fork = 1, 2,--- , N. Whenever this 


sondition fails, the term p, log p, + q log gq must be 
eplaced by ( —log 2). 


B. Correlation 


Let us now consider the case where the signals are not 
ndependent. The simplest case, perhaps, is that where 
the probability of correct transmission p, depends upon 
hether or not the preceding signal was transmitted 
correctly. Although a large variety of questions of this 
type may be formulated, we feel that the following dis- 
ussion will illustrate the uniform method which may be 
employed to treat them. 


‘Let 
p,. = probability of correct transmission of the kth 
signal, if the (k — 1)st signal was transmitted 
correctly. 
r, = probability of correct transmission of the kth 
signal, if the (k — 1)st signal was transmitted 


incorrectly. (16) 
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Define the sequence of functions, 


f.(x) = expected value of the logarithm of the final 
capital obtained from the remaining k stages 
of the original N-stage process, starting with 
an initial capital z, and the information that 
the (k — 1)st signal was transmitted correctly, 
using an optimal policy. 

g.(x) = the corresponding function in the case where 
the (k — 1)st signal was transmitted incorrectly. 

(17) 
Then, as above, 
f(z) = Max [py-nsifralz + 9) 
O<su<az 
ae é! = Du—K+1) Ju—1(% <s y)| 
g(x) = Max [ry-niifx—i(z + Y) 
O<y<ar 
Se le Sty a) Gee ee (18) 
It follows inductively, as before, that 
filx) = logz + a, 
gAx) = logx + hy, (19) 


where the recurrence relations for the a, and b, are readily 
established. 


C. Multiple Signal Channels 


Let us now consider the situation in which the channel 
is called upon to transmit any of M different signals, with 
the probability of correct transmission dependent upon 
the signal transmitted. 

Assume that the gambler possesses the following infor- 
mation: : 


p;; = the conditional probability that the 7 signal was 
sent if the 7 signal was received. 

q: = the probability of receiving the 7 signal at any 
time. 

r; = the return from a unit winning bet on the 7 
signal. (20) 


For simplicity of notation principally, we assume that 
there is no time dependence. It will be readily seen from 
what has preceded and what follows that this factor 
could easily be included. 

The process proceeds as follows. Upon receiving a 
signal, say the 7 signal, the gambler is free to bet an 
amount z,; that the signal actually sent was a 7 signal. 
The restrictions upon z; are 


2; 20 (21a) 


(21b) 


where x is the current quantity of capital. 

As before, we assume that the purpose of the process 
is to maximize the expected value of the logarithm of 
the capital remaining after N stages. Let us then define 
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the function fy(x), for N Ls 25, =e Sandie eC) are esate 
expected value of the logarithm, starting with a capital 
of x, and using an optimal policy. 


Observing that there is a probability p;; of winning 
r,z; and losing on the other bets when the 7 signal is 
received, we see that with probability p;; the gambler 

. M n . 
has the quantity r;z; + « — >°™, z, left after one stage 
when the 2 signal is received. 

Consequently, employing the principle of optimality as 
above, we obtain the recurrence relations 


M M 
fr(z) = >> a4Max Do DE 
~=1 R(x) g=1 
M 
tualray a ED > «)} ) N22 
s=1 
M M 
file) = >> a4Max De 
t=1 R(x) 7=1 
M 
- log (ra +a4—- >} (22) 
s=1 
Here R(x) is the region in z; space defined by (21). 
Let us now prove inductively that 
fixe) = lor a -E NK (23) 
where K is a constant independent of « given by 
M M M 
Kee a4Max py log (rz, +4 — > «.)}. (24) 
t=1 R(1) 7=1 s=1 
The result is true for N = 1, as we see upon setting 


N 1 in (23) and comparing with (22). Assume that 
the result holds for N — 1 and substitute the expression 
for fy-1(@) given in (23) in (22). Collecting terms, we 
obtain the expression in (23) for NV. 

From the expression for / it is clear that the optimal 
policy depends only on p,; and r; and not on the 4q;, 
though the return itself does also depend on q;. An 
interesting special case is that in which it is required that 
yo \2; = x; that is, the gambler is required to wager all 
of his available funds. In this case the optimal policy 
depends only on (p;;); ¢.e., on the communication channel, 
and not at all on g; or on 7,. 


Schematically 


este «|: 


If we now introduce the quantities 


p: = probability of sending an 7, 
t;; = the conditional probability that if an 7 is sent, 
then 7 is received, 
we then have 
M 
Di = 2. UPis 
M 
GQ: = Qi ditis, 
7=1 
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and (t,;) is the inverse of (p;,;). Consequently, in this case 
the optimal policy is dependent only on (¢,;), whieh 
characterizes the communication channel and is inde- 
pendent of both the source characterized by p; and the 
outside world, characterized by the odds, r;. The return 
however, does depend on all these quantities. 

These considerations are significant, for they imply 
that the gambler’s actions are controlled solely by the 
quality of the communication channel, though his ulti 
mate return is determined by the situation 7m toto. This 
leads to the possibility of comparing two channels undet 
the same conditions or of evaluating the performance ol 
a given channel under various conditions. 

Specializations to the unsymmetric binary channel are 
immediate. 


D. Continuum of Signals 


Consider now the case where there is a continuum of 
different signals. Let 


dG(u, v) = the conditional probability that a signal 
with label between v and v + dv is sent if 

the wu signal is received, —©7 < u,v < @, 

(25) 


and let 


dH(u) = the probability that a signal with label 
between u and u + dw is received at any 
stage. (26) 


Then, considering the process corresponding to the 
special case discussed above, even bets being assumed 
for the sake of simplicity, we derive the recurrence relations 


fx(x) = is | Max ic fv-1(2z@)) dG(u, »| dH(u), 


Oe ee [ Mas ie log (2z(v)) dG(u, | dH). “Ga 


In both cases, the maximization is over all functions 
z(v) satisfying the conditions 


20) = 0 (28a) 
ib i 20) dv = x. (28b) 
As above, it is easily seen inductively that 

fx(x) = log 2a + KN, (29) 

where 
Kos : Neh he log 2(v) dG(u, »| dH (u), (30) 

and 

20) = 05 (31a) 
i: 20) dle (31b) 


XI. Criterion Functions YIELDING 
INVARIANT PoLicins 


We have seen above that the linear function yields an 
ariant policy at each stage, and likewise the logarithm. 
is of interest to determine all criterion functions pos- 
ssing this property. 

The following version of the problem will be treated 
re. Let ¢ (x) be a monotone increasing concave function 
fined over 0 < x < 1. Consider the one-stage process 
rere we wish to maximize 


Ey) = pox + y) + 1 — pola — y). (32) 


e function L(y) is concave as a function of y for 0 < y 
x,0 <x < 1, and thus has a unique maximum, unless 
x) is linear and p = 3. Let us dismiss the case of lin- 
irity by requiring strict concavity, ¢’’(x) < 0, and take 
>. 

‘Let us assume that, for all 2 inO0 < x < 1, there isa 
blution of 


dE 


Ge ama Leet a y= 0. 32) 


ving the form 


y = r(p)x, (34) 


here r(p) is a nonnegative differentiable function of p 
br + < p < 1, possessing a continuous derivative. 
Then (33) is equivalent to the functional equation 


b/(a(L + r@)) = (1 — p)o’@(L — r@))), 


for, et see So ee (35) 
Let <1 + r(p)) = y. Then (35) reduces to 
P ii 2 BY yC ae so) 
oy) = 6( a), ao 


‘We now differentiate first with respect to y, and then 
ith respect to p, obtaining the two equations 


Boo = (arb ae) 


it ! d eS r(p) yr y( rea r(p)) } 
a— pt = Yap ( oa ( eo) 87) 
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Dividing the two equations, we obtain 
yo"'(y) u(p) 
; a , 38 
wW) ~ pl — p\du/ap) oe 


where u(p) = (1-r(p))/(1. + r(p)). 
Since the left side is a function of y and the right side a 
function of p, both sides must be constant. Setting 


ee SCE A (39) 
we obtain 
log ¢’y) = K logy + a. (40) 
Hence 
¢’(y) = cy”. (41) 


Without loss of generality, let us normalize, so that 
¢’(t) = 1. Then 


$y) = y" (42) 
If K > — 1, we have 
K+1 
oo) =eaqt% (43) 
If K = —1, we have 
oy) = logy +e’. (44) 


It is clear that K > —1 is necessary for ¢(y) to be non- 
negative for y > 0. Finally, without loss of generality, 
we can let c = “ch = 0: 


XII. Discussion 


In the foregoing pages, we have essayed to describe 
some applications of the concepts and techniques of the 
theory of dynamic programming to various aspects of 
communication theory. As simple illustrations we have 
considered a particular process discussed by Kelly and 
various generalizations. In subsequent papers, we pro- 
pose to treat in greater detail some mathematical models 
of greater scope. 
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Complex Processes for Envelopes of Normal Noise’ 


RICHARD ARENSt 


Summary—The paper presents a brief exposition of the tech- 
nique of complex normal random variables as utilized in the study 
of the envelopes of Gaussian noise processes. The central concept 
is the pre-envelope z( ) of a real normal process. The pre-envelope 
z( ) of a real function x( ) is a complex function whose real part 
is x( ) and whose absolute value is the envelope, in the sense of 
high-frequency theory, of x( ). The joint probability density for 
z(t), z(t) is found and used to get the threshold crossing rate. 
Consideration of nonstationary processes is included. 


I. InTRODUCTION 


HE envelope r(t) of a real function x of one variable 
(“time’’) is obtained, according to one concep- 
tion'’” of the envelope, by forming the conjugate 

(Hilbert transform) function y of a and taking the abso- 
lute value | 2(¢)| of z(t) = a(t) + zy(t). We call the function 
z the pre-envelope of x. Its spectrum is supported wholly 
by one half the frequency axis; we choose the positive 
half. When two “signals” 2, 2. are superposed, the 
pre-envelope of 2, + x, is the sum of the pre-envelopes 
21, 2 Of 2, X2, respectively; and more generally if x is 
filtered with output at time ¢ 


/ Wt — sx(s) ds (1) 
0 
then the pre-envelope of the output is at time 1 
| Wo = das ds (2) 
0 


where z is the pre-envelope of x. In dealing with linear 
systems it is evidently better to utilize the pre-envelope, 
because then at each stage between successive filterings 
the envelope can be readily inspected. (This procedure is 
often employed for aesthetic reasons independent of 
envelopes.) We shall, however, be concerned with random 
processes rather than a single time series in this paper. 

Accordingly, let x be a sample function from some real 
normal (so-called ‘‘Gaussian’’) stationary process. Form 
z = x + ty where z is the pre-envelope of x. This gives 
rise to a new random process which is stationary, normal 
and, of course, complex. We first prove this and then 
illustrate the utility of such complex processes by calcu- 
lating the “alarm rate’? (see Section V) or threshold- 
crossing rate for a ‘‘noise”’ envelope. Our object is, however, 
more to exhibit the technique and illustrate the principles 


* Manuscript received by the PGIT, February 26, 1957. 

+ Dept. of Math., Univ. of Calif., and RCA, Los Angeles, Calif. 

18. O. Rice, ““Mathematical analysis of random noise,” Bell. Sys. 
Tech. J., vol. 23, p. 282; 1944, and vol. 24, p. 109; 1945. 

2V.1. Bunimovitch, ‘The fluctuation process as a vibration with 
random amplitude and phase,” J. Tech. Phys., USSR, vol. 14, p. 
1231; November, 1949. 


of complex valued random processes. Our experience ha 
been that such technique is practically indispensable fe 
treating envelope questions for nonstationary processes 
and considerably more convenient for arriving at th 
known results than the known treatment of the stationar 
case. 


Il. THe Pre-ENVELOPE Process EXAMINED 


Bunimovich*® has pointed out that if we have a ree 

valued function, 

x) = | exp Qmifd) dX(f), ( 
where X is of bounded variation and dX(—f) = dX(q 
since xv is real, then the absolute value of 
a(t) + ry() = ad) 

aN5 / exp (2maft) dX(f) + AX (4 

O+ o- 


is the Rice* envelope of (3). It is clear that z is linearl 
determined by x. Hence z is also’ a normal, stationar 
process, although, as pointed out earlier, a complex one 
Here we confine ourselves to the calculation of the aut 
correlation for the z process, which is by definition (# = 
expected value or expectation), 


LEle(He(s)] = R(t — 8), G 


a justified notation because it depends only on t — s. W 
aim to express #,(t) in terms of the integrated powe 
spectrum’ W for the x process, in terms of which th 
autocorrelation of the latter is given by 


Ee@s@ eee ie cos Dni(? 1 5) dW Can 


We expand the cosine, obtaining briefly an array cos cc 
+ sin sin, where the first factor in each case depends on 
and the second on s. For the expectation of a(t)y(s) w 
remark that the expectation of a linear functional is th 
linear functional of the expectation.’ The linear function: 


3 bid. 

* Rice, op. cit. The author gratefully acknowledges that this w: 
pointed out by Prof. J. Dugundji. 

5 J. L. Doob, “Stochastic Processes,’ John Wiley & Sons, Inc 
New York, N.Y.; 1953 and A. Khintchine, ‘‘Korrelationstheor 
der stationdren stochastischen Prozesse,’’ Math. Ann., vol. 10! 
p. 604; 1934. 

5 Rice, op. cit. 

7In practice this reduces to interchanging the order of sever 
limit-operations. In what follows we do not trouble to justify th 
interchange of operations, because our purpose is mainly expositic 
(see footnote 10). 


re intended is the Hilbert transform applied to the 
cond appearing function 2, in (6). Thus the expectation 


| placed by cos sin — sin cos, since the Hilbert transform 
j) cos is sin and that of sin is —cos. Operating, instead, 
|| the t-dependent factors gives the expectation of y(t) 
hs) with the array sin cos + cos sin, and for y(t)y(s) we 
tain the array sin sin cos cos. Adding the integrals for 
Z)x(s), y(t)y(s), —2x(t)y(s), and 7zy(t)a(s), and dividing 
r 2 gives the quantity called for in (5), whence 


R(t — 8) = Ne exp (277 f(t — s)) dW(f). (7) 


III. Comptex Norma Distriputrions 


|/ Just as in vector analysis, there is a point in introducing 
}mplex processes only if in subsequent operations, no 
rther reference to the real components is necessary. 
) order to gain such facility, some basic ideas have to be 
ept in mind. The probability density F’( ) of a complex 
ndom variable z is a real function defined in the complex 
zane. Also it is nonnegative and has integral one over 
e plane. If, for example, 


1 —2e/ace 
é 
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F@) = (8) 
ken we call z a normally distributed random variable 
ith zero mean and semivariance o”. This latter is half 
he mean (/) of 22 = | z |. The probability density for 
= | z| can be obtained from (8) by changing to polar 
yordinates and integrating over the angle and takes the 
orm. 

i) =o * exp(—2) ¢ 7°) 


(r 2 0), (9) 


”) 


| eae called ‘Rayleigh type with parameter o”’. 
w is a complex normally distributed random variable 
ith mean m which may, of course, be complex, then 
= m+ z where z is distributed by (8). The probability 
ensity forr = | w! can be obtained’ via polar coordinates, 
d is 


| Fale) = ro *Lolro™*u) exp [2°10 *" +) (10) 


there r > 0, and » = | m |. This corresponds to the dis- 
ribution of the envelope of signal plus noise. u is the 
4gnal envelope and o” is the mean square of the noise. 
|The typical jointly normal probability density for n 
pmplex random variables is given by 


Tier, BEoms 


a i 


yhere q;; = g;, and C = (27) ” X determinant of (q;;). 


8 J. L. Lawson and G. E. Uhlenbeck, ‘Threshold Signals,”’ 
M.I.T. Rad. Lab. Ser.. McGraw-Hill Book Co., Inc., New York, 
V.Y., vol. 24, p. 154; 1947. 
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,2n) = C exp ep 3 wees) (11) 
a 
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The inverse (r;;) of the matrix (q,;) has this property” 


(12) 


Ue 2H (zz; ]. 


IV. Rerurn To tHE PRE-HNVELOPE PROCESS 


In a possibly nonstationary, normal process we have a 
joint probability density like (11) for the values 2(¢,) 
=2,,-:- , 2(t,) = z, of a sample function 2, and the 7r;, 
of (12) takes the form 


r(t;, ti) = SHla(t,e(t,)] (13) 


where r is a function of two variables characterizing the 
process. If the process is stationary, this depends only on 
i; — t; and may be written r;; = r(@;, ;) = BiG — &). 
In view of (5), this R, is given for the pre-envelope proc- 
ess by (7). 

If a process characterized by (13) (¢.e., by specifying 7) 
is subjected to a linear operation w = L(z), then w is 
again a normally distributed complex random variable, 
provided there are no’? convergence difficulties in the 
carrying out of the operation ZL. In carrying out the 
operation ZL on the function z, there will usually occur a 
dummy variable t, as in the examples 


{4 “0] 


i A(te(t) dt. 


L(2) (14) 


and 


Or = (15) 


These and similar functional operations may be written as 


(16) 


Naturally, the ¢ in (14)-(16) may be replaced by any 
other letter without altering the significance, hence the 
name dummy variable. Supposing that the questions 
alluded to’ do not impede us, then” 


L@) = Lyle]. 


9 These facts can be established by starting with a characteristic 


function ‘ 
ofr, ee Gc) = exp (=27-Q) 
where Q = Yrijfit;(rij = F):). The Fourier relation between ¢ and 
F is that 
PAC gre ear ee) | Fe exp 7B dz, --+ dyn 


where B = D(xrpE_, + Yin rk); Ze = Cert ey Se =Ext 2 ky and the 
integration is over R?”. The details are easily adapted from the real 
variable case as handled in H. Cramér, ““Mathematical Methods of 
Statistics,” Princeton University Press, Princeton, N. J.; 1946. 

10 This seems to be the case in applications. We refer to Doob, 
op. cit., Khintchine, op. cit., V. Cramér, ‘‘On the theory of stationary 
random processes,” Ann. Math., vol. 41, p. 215; 1940, and A. Blanc- 
Lapierre and R. Fortet, ‘Theorie des Fonctions Aleatoires,” Masson, 
Paris; 1953, for the mathematical theory. 

1 A bar placed over the L in our examples means that the complex 
coefficient a in (9) and the function A in (10) have to be replaced 
by their conjugates. 
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2B [Lyle Lo e)]] 
= Lali Ee Oe)]] 
= Lye Legit to )l: 

Thus the probability density of w = L(z) ts given by 


(2r)~'7*? exp [—27'7r? | w |?] (17) 


where r° ts given by 


7 (=38|| WwW [*)) = Lies Laaslttt, s)]. (18) 


The full utility of the complex method (examples in 
Section V) is not attained until also two or more linear 
functions of a given process z can be treated. Such a 
generalization of (18) is easily obtained. Let the linear 
functionals be 


UW, = IGAPAR 2 a , Wr = les (19) 
Then, analogous to, and generalizing (18), 
28 [bw] = Lic Lic (rt, )]. (20) 


If these numbers are taken as the r;; and the q;; calculated 
in terms of them, then (11) gives the joint probability density 
of the variables (19). Of course the z2’s in (11) have to be 
replaced by w’s. 


V. ALARM RATE FOR THE ENVELOPE 


By this we mean the following. Let x be a stationary 
normal process specified by an F# as in (6). Let P(T) be 
the probability that the envelope will rise above a level V 
in a certain interval of time T, 7.e., 


P(T) = prob. {| 20) | < V < la) ]}. 20 


If for small 7’ we have 
P= pl erg 


where g tends to 0 faster than 7’, as T’ tends to 0, then p is 
the alarm rate associated with the level V. Now (21) 
could be evaluated by setting up the bivariate distri- 
bution (11) again mentioned in Section V and integrating, 
but this labor is unnecessary. 

Rice’ solves this problem by using the joint distri- 
bution for | 2(¢)| and d/dt | z(¢)|. We shall, however, use 
the joint distribution of the two linear functionals 


le@].-0, 
w, = Mle] = [2’O]i-0 


and hope that our performance may encourage the reader 
to use complex variables for other envelope problems. 

We actually do not need stationarity; so let the z 
process be specified as by (13). [We recapitulate: if the 
datum 7s'a real process such as (6), we form first (7) and 
take r(é, s) = R,(s — t).] With the r,;; to use in (11) for 
W,, W2(7,7 = 1, 2) we have, according to (20), the following 
array: 


Dey = 


Ua 


(22) 
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Seplembe 
) 
(8) ea | 2 nt o| 


F) om 
E A a]. E ai ol. 


7.e., we have the automatically hermitean matrix 
r(0, 0) 72(0, 0) 
rOx0) 112(0, 0) 


where the indices in (23) denote partial derivatives. Le 
the determinant of (23) be called D. Then the znverse is 
omitting the (0, 0)’s from (23), 


79 if Tye ity \\ 
aay (| ie 


The joint probability density for 2(0), 2’(0) is, therefore 
F(z, w) = D(2n)” exp (—27'Q) (26 


where Q = D7‘(ris| 2 |? — 2Rr,wz + r | w |’), R denotin 
“real part.” 
In the stationary case, (23) takes the form 


RO) +£82(0) 
—R0)  —R2"(0). 


(23 


(24 


(26 


Reference to (7) discloses that 


RAO) = i ; dW(f) = R(O) = “mean power” 


RAO ii  FaWGye Olean 
and 
R’'(0) = —40° ig fp aw(f) <0. 


The convergence of these integrals would seem to be th 
only assumption needed to justify the result. 

Proceeding from (25), a kinematic consideration pre¢ 
vides a short cut to the alarm rate. We choose to thin 
of (25) as giving the position and velocity distribu 
tion of a gas in the plane; and we desire the rate ¢ 
which matter diffuses out of the circle of radius V abot 
0. Consider the rate of diffusion over an element h of a1 
of this circle. The rotational invariance of the distributio 
makes this rate independent of where on the circle h - 
placed, and so we place it at 2 = V. We may think of - 
as a small vertical segment’” whose endpoints are V, 
+ th. The particles that will cross with velocity w in on 
second are those’” in the parallelogram of area uh, di 
termined by fh and the vector w = uw + w, provided 
> 0, and hence have relative mass F(V, w)uh. Integratin 
overall relevant (wu, v), we obtain 


h / / PV, Liou du dy. (2 


2 To a suitable degree of approximation, of course. 


‘his is the rate of diffusion of mass over an arc of length h 
the circle. Multiplying by 21V/h yields the diffusion 


iat a particle will escape from the circle per unit of time; 
nd this is, of course, the alarm rate. 

It remains to evaluate (27). We set z = V,w = 
1 (25), and the alarm rate is readily evaluated as 


fr’ exp (—2°'r-*V*)[(D/2ar)* exp (—27'r' Vr) 
+ Vr'Rr, erf (VRr,(Dr)*)] 


UN 108 


(28) 


rhere 


erf (y) = Qn)? [ exp (—2>'x”) dz. 


In the stationary case, there is considerable simplifi- 
ation because the real part Rr, of r, = R/(0) is 0, and 
e have nothing but a constant, (D/2zr)4, multiplied by 
e envelope’s probability density at V [see (9)]. This is 
tice’s result."° 


VI. Mosiue FitrEers 


' The generality of (28) in going beyond the stationary 
ase might have been pointless were it not for the fact 
nat ‘Gaussian noise’? may be encountered which is not 
ationary. Of course in that case a pre-envelope process 
mnnot be constructed via (7). A consideration of the 
Henesis of such noise is helpful, however. It usually arises 
Fecause stationary noise is sent through a mobile (= non- 


} 


1 Rice, op. cit. See (4.8). 
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stationary) device. It often happens that a formula of 
the type 

/ A; (s\x(s)-ds' = 20) (29) 
can be discovered such that for a real input x(t), z(t) 
represents the pre-envelope for the output, 7.e., z(¢) is the 
real output’* and | 2(é) | is the envelope of the output. 
Then the function (13) for the pre-envelope of the output 
is as follows 


fis) = 14 [| x(u)x(v) A (u) Av) du iv| 


(30) 


+ | Rei 2 oy nee 
If A, has a Fourier transform 
on = / A,(v) exp (—2zifv) dv 


then the Fourier inversion formula enables us to write 
also 


rts) = 4 f eo) awe). G1) 
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